Rust Functions, Modules, Packages, Crates, and You

Wooden pallets stacked one on top another
I know the code is in here… somewhere.

Come to find out, I’m learning Rust from old documentation. Both of the printed Rust books I have are for the pre-“2018 edition” and I think that’s contributing to some confusion I have about functions, modules, packages, and crates. A new version of the official book is coming out in the next month or so. If you’ve been reading the online documentation, you’re ok: it is updated for the “2018 edition”. I’ve looked at some of these parts of Rust before, but I recently found another new resource, the Edition Guide, which clears up some of my issues. Of special interest here is the section on Path Clarity, which was heavily influenced by RFC 2126, the RFC that improved this part of Rust.

I learned some of the history (and excitement) of RFC 2126 while listening to the Request for Explanation podcast, episode 10. Anyway, let’s go back to basics and have a look at Rust functions, modules, packages and crates as the language sits in mid-2019. I’ll present some examples from my web application we’ve been looking at. I’m going to cut out unnecessary bits to simplify things, so a “…” means there was more there in order for this to compile. You can always see whatever state it happens to be in, here.

Crates and Packages

A Rust crate (like Rocket or Diesel) is a unit of compilation: the compiler turns it into either a binary or a library. A binary crate is runnable, while a library crate provides functionality to other crates that link against it. A package (like my web app) ties together one or more crates with a single Cargo.toml file; it can hold at most one library crate and any number of binary crates. The toml file configures the package’s dependencies and some minimal information about compiling the source. By convention, a binary crate has a src/main.rs with a main() function which directs how the binary runs (additional binaries can live under src/bin/). A library crate has a src/lib.rs, which is the crate root, the top layer of the library. This top layer directs which pieces inside are available to users of the library.

Rust Functions

Functions are easy: they are the subroutines in your source code. A function starts with fn, possibly receives some parameters, and might return a value. Also, a function may be marked public with pub or left private, which is the default. The main() function inside src/main.rs is a special function that runs when the binary is called from the command line. It dictates the start of your program and you take control from there. You may create other functions, just avoid reserved words (or use the r# prefix to indicate you mean YOUR function, not the reserved word, for instance r#match if you want to name a function “match”). Closely related to functions are methods and the traits that group them, which we’ve looked at before.
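
As a minimal sketch of those pieces in one place (parameters, a return value, pub versus private, and the r# prefix), here is a small stand-alone program. All of the function names are hypothetical and have nothing to do with the web application.

```rust
/// Public function: takes two parameters and returns a value.
/// The last expression has no trailing `;`, so it is the return value.
pub fn add(a: i32, b: i32) -> i32 {
    a + b
}

/// Private function (no `pub`): only visible inside this module.
fn shout(msg: &str) -> String {
    msg.to_uppercase()
}

/// `match` is a reserved word, so the raw-identifier prefix `r#`
/// is needed to use it as a function name.
fn r#match(candidate: &str, wanted: &str) -> bool {
    candidate == wanted
}

fn main() {
    println!("{}", add(2, 3));
    println!("{}", shout("hello"));
    println!("{}", r#match("rust", "rust"));
}
```

Note that the caller has to write r#match as well; the prefix is part of how the name is spelled wherever it appears.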

<src/lib.rs>

...
use diesel::prelude::*;
...
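// Public, so the binary in src/bin/pps.rs can call it.
pub fn setup_db() -> PgConnection {
    // establish() returns a Result; expect() panics with this message on error.
    PgConnection::establish(&CONFIG.database_url)
        .expect("Error connecting to db")
}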

setup_db() is a fairly simple function: it accepts no incoming parameters and returns a database connection struct called PgConnection. It has pub before fn to indicate it is a “public” function. Without that, my web application in src/bin/pps.rs could not call this function, because it would not be visible outside the library crate. Without pub, setup_db() would only be callable from code inside the library crate itself. Since I am designing my application as a library crate, I chose to put setup_db() in the main src/lib.rs file. The binary that I use to “run” my web application is in src/bin/pps.rs and contains a main() function.

Let’s look at the return type, PgConnection. This is a struct defined by the database ORM library crate, Diesel. I can write a function that returns this particular type of struct only because I have use diesel::prelude::*; at the top (and Diesel is listed in the toml file as well). The Diesel library crate provides prelude as a simple way to bring its most commonly used items into my package. Diesel provides the PgConnection struct as public (or what good would the crate be), so I can now use that struct in my code. The prelude also gives me establish(). Method or trait, how can you tell? It takes no self parameter and is called on the type rather than on a value, which makes it an associated function; in this case it is declared by Diesel’s Connection trait, which PgConnection implements. Just like you’d call String::new() for a new string, I’m calling PgConnection::establish() for a new database connection, unwrapping the Result with expect(), and then returning the connection (see, no trailing ; on the line).
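
To make the Type::function() shape concrete without pulling in Diesel, here is a rough sketch using stand-in types. The trait below only imitates the general shape of a connection trait; FakeConnection, the String error type, and the url parameter on setup_db() are all assumptions made for illustration, not Diesel’s actual definitions.

```rust
/// Hypothetical stand-in for a connection trait.
pub trait Connection: Sized {
    /// Associated function: there is no `self` parameter, so it is
    /// called on the type, as in `FakeConnection::establish(..)`.
    fn establish(url: &str) -> Result<Self, String>;
}

/// Hypothetical stand-in for a concrete connection struct.
pub struct FakeConnection {
    pub url: String,
}

impl Connection for FakeConnection {
    fn establish(url: &str) -> Result<Self, String> {
        if url.starts_with("postgres://") {
            Ok(FakeConnection { url: url.to_string() })
        } else {
            Err(format!("unsupported url: {}", url))
        }
    }
}

/// Same pattern as setup_db() above: unwrap the Result or panic.
pub fn setup_db(url: &str) -> FakeConnection {
    FakeConnection::establish(url).expect("Error connecting to db")
}

fn main() {
    let conn = setup_db("postgres://localhost/pps");
    println!("connected to {}", conn.url);
}
```

The trait has to be in scope for the call to compile, which is the job the glob import of a prelude does in the real code.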




Rust Modules

Functions (and other things) can be grouped together into a Module. For instance, setup_logging() is also in src/lib.rs. However, I could have wrapped it inside a named module, like so:

<src/lib.rs>

...
pub mod setting_up {
    ...
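    // Paths as written resolve from the crate root (2015-edition style);
    // under the 2018 edition, write `use crate::logging::LOGGING;` instead.
    use logging::LOGGING;
    use settings::CONFIG;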

    pub fn setup_logging() {
        let applogger = &LOGGING.logger;

        let run_level = &CONFIG.server.run_level;
        warn!(applogger, "Service starting"; "run_level" => run_level);
    }
}

Now it is part of my setting_up module. Here also, the module needs to be pub so that my application can use it and the public functions inside it. Now all of the enums and structs and functions inside the module setting_up are contained together. As long as they are public, I can still get to them in my application.
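
Here is a small self-contained sketch of the same idea: a public module with one public and one private function, reaching a sibling module through a crate:: path. The module contents (RUN_LEVEL, startup_message(), decorate()) are hypothetical and only echo the names used above.

```rust
// A private sibling module, standing in for the real settings module.
mod settings {
    pub const RUN_LEVEL: &str = "development";
}

pub mod setting_up {
    // 2018-edition path: start from the crate root with `crate::`.
    use crate::settings::RUN_LEVEL;

    /// Public: callable from outside the module.
    pub fn startup_message() -> String {
        format!("Service starting ({})", decorate(RUN_LEVEL))
    }

    /// Private: only code inside `setting_up` can call this.
    fn decorate(level: &str) -> String {
        format!("run_level={}", level)
    }
}

fn main() {
    println!("{}", setting_up::startup_message());
    // setting_up::decorate("x"); // would not compile: `decorate` is private
}
```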

Notice I use logging::LOGGING; and use settings::CONFIG; These bring in those two global statics (a Lazy static like LOGGING is built the first time it is accessed, not at program start). I included pub mod logging; and pub mod settings; at the top level, in src/lib.rs, so they are available anyplace deeper in my app. I just need to use them since I reference them in this module’s code.
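
The global statics in this app are declared with Lazy (the LOGGING definition in logging.rs shows this), and a Lazy value runs its initializer on first access. The sketch below illustrates that behavior using std::sync::OnceLock from the standard library as a stand-in, so it runs without an external crate. It is not the app’s actual code: the name field, the INIT_COUNT counter, and the logging() accessor are hypothetical.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

pub struct Logging {
    pub name: String,
}

// Counts how many times the initializer actually runs.
static INIT_COUNT: AtomicUsize = AtomicUsize::new(0);

// The global itself starts out empty.
static LOGGING: OnceLock<Logging> = OnceLock::new();

/// Returns the global, building it on the first call only.
pub fn logging() -> &'static Logging {
    LOGGING.get_or_init(|| {
        INIT_COUNT.fetch_add(1, Ordering::SeqCst);
        Logging { name: "applog".to_string() }
    })
}

fn main() {
    println!("inits before first use: {}", INIT_COUNT.load(Ordering::SeqCst));
    println!("logger: {}", logging().name);
    println!("logger: {}", logging().name);
    println!("inits after two uses: {}", INIT_COUNT.load(Ordering::SeqCst));
}
```

A practical consequence is that a failure inside the initializer (an unwrap() on a missing log file, say) shows up at the first use of the static, wherever that happens to be.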

Splitting firewood with an axe

Split, for Clarity

On the other hand, instead of defining a module, or multiple modules, inside a single file like above, you can use a different file to signify a module. This helps split out and separate your code, making it easier to take in a bit at a time. I did that here, with logging.rs:

<src/logging.rs>

...
use slog::{FnValue, *};

pub struct Logging {
    pub logger: slog::Logger,
}

pub static LOGGING: Lazy<Logging> = Lazy::new(|| {
    let logconfig = &CONFIG.logconfig;

    let logfile = &logconfig.applog_path;
    let file = OpenOptions::new()
        .create(true)
        .write(true)
        .truncate(true)
        .open(logfile)
        .unwrap();

    let applogger = slog::Logger::root(
        Mutex::new(slog_bunyan::default(file)).fuse(),
        o!("location" => FnValue(move |info| {
            format!("{}:{} {}", info.file(), info.line(), info.module())
        })),
    );

    Logging { logger: applogger }
});

I have a struct and a static instance of it, both of them public, defined in logging.rs. logging.rs becomes a module of my library crate when I declare it. At the top of src/lib.rs I have pub mod logging; which tells the compiler to load the module from the file logging.rs and makes the module public, so its public items are available outside the crate (so my src/bin/pps.rs application can use what it provides).

In this case, you also see I use slog::{FnValue, *}; which combines use slog::FnValue; (which I need for the FnValue struct) and use slog::*; which gives me the fuse() method (through slog’s Drain trait) and the o! macro. I was able to combine those into a single use statement to get what I needed from that external crate. Strictly speaking, the glob already covers FnValue, so naming it explicitly is only for the reader’s benefit.
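
The same nested-plus-glob form can be tried against a local module, which keeps the example runnable on its own. The module slog_like and everything in it are hypothetical stand-ins, not slog’s API.

```rust
mod slog_like {
    pub struct FnValue(pub u32);
    pub struct Logger;

    pub fn describe(value: &FnValue) -> String {
        format!("value {}", value.0)
    }
}

// One statement doing the work of two:
//   use slog_like::FnValue;
//   use slog_like::*;
use slog_like::{FnValue, *};

fn main() {
    let value = FnValue(7);      // named explicitly in the `use`
    let _logger = Logger;        // arrives through the glob
    println!("{}", describe(&value)); // also through the glob
}
```

Listing a name next to the glob is allowed; the explicit import simply takes precedence over the glob for that name.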

The old books I have been referencing have you declaring the third-party crates you want to use in your application in your Cargo.toml file (which is still required), but also you’d have to bring each one in with an extern crate each_crate; at the top of main.rs or lib.rs. Thankfully, that’s no longer needed… 99% of the time. In fact, I had a long list of those myself – I am surprised cargo build didn’t warn me it was unneeded. Actually, I do have one crate I am using which still needs this “2015-edition” requirement: Diesel. Apparently, it is doing some fancy macro work and/or hasn’t been upgraded (yet?) for the “2018-edition” of Rust, so at the top of src/lib.rs, I have:

#[macro_use]
extern crate diesel;
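
As a loose analogy for what #[macro_use] does, the attribute can also be placed on a local module, where it makes that module’s macro_rules! macros usable in the code that follows it. This sketch only shows that local form; it says nothing about how Diesel’s own macros are built, and the square! macro is hypothetical.

```rust
// Without #[macro_use], `square!` would stop being visible
// at the end of the `macros` module.
#[macro_use]
mod macros {
    macro_rules! square {
        ($x:expr) => {
            $x * $x
        };
    }
}

fn main() {
    println!("{}", square!(4));
}
```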

A Few Standards and TOMLs

The Rust crate std is the standard library, and is available automatically. The primitive data types, a healthy list of macros, and a small prelude of common types such as Vec, String, and Option can be used without any use statement. But, if you need filesystem tools: use std::fs; and if you need a HashMap, you’ll need to use std::collections::HashMap; And yes, all external crates you depend on inside your source will need to be listed in Cargo.toml. This configuration helps you though: a bare version such as "1.0.94" means “this version or any newer semver-compatible one”, so cargo update can pick up compatible releases, while the Cargo.lock file keeps builds pinned until you ask for an update. Cargo will NOT move to an incompatible release (a new major version, or a new minor version for 0.x crates) on its own. You will need to change the version manually, so you can test whether the release broke anything you depended on in your code. Here is a piece of my ever-growing Cargo.toml file for the web application so far:

...
[dependencies]
slog = "2.5.0"
slog-bunyan = "2.1.0"
base64 = "0.10.1"
rand = "0.7.0"
rand_core = "0.5.0"
rust-crypto = "0.2.36"
config = "0.9.3"
serde = "1.0.94"
serde_derive = "1.0.94"
serde_json = "1.0.40"
once_cell = "0.2.2"
dotenv = "0.14.1"
chrono = "0.4.7"
rocket = "0.4.2"
rocket-slog = "0.4.0"

[dependencies.diesel]
version = "1.4.2"
features = ["postgres","chrono"]

[dependencies.rocket_contrib]
version = "0.4.2"
default-features = false
features = ["serve","handlebars_templates","helmet","json"]
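
Going back to the standard library for a moment: std needs no entry in Cargo.toml at all, and the only step is the use line for anything outside the prelude. A minimal sketch, with a hypothetical word_counts() helper:

```rust
// String, Vec, Option and friends need no `use`: they come from the prelude.
// HashMap is not in the prelude, so it is imported explicitly.
use std::collections::HashMap;

/// Counts how often each whitespace-separated word appears.
pub fn word_counts(text: &str) -> HashMap<String, usize> {
    let mut counts = HashMap::new();
    for word in text.split_whitespace() {
        *counts.entry(word.to_string()).or_insert(0) += 1;
    }
    counts
}

fn main() {
    let counts = word_counts("crate module crate");
    println!("crate appears {} times", counts["crate"]);
}
```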

Author: Jeff Culverhouse
