(A Few) Advanced Variable Types in Rust

Get a firm grasp of each of these smart pointers and other advanced variables in Rust: Box, Cell, RefCell, Rc, Arc, RwLock, Mutex, OnceCell (and there are others!)


“I haven’t seen Evil Dead II yet”. Much is made of this simple statement in the movie adaptation of High Fidelity. Does “yet” mean the person does, indeed, intend to see the film? Jack Black’s character is having real trouble with the concept: not only does he know that the speaker, John Cusack’s character, has seen Evil Dead II, but what idiot wouldn’t see it, “because it’s a brilliant film. It’s so funny, and violent, and the soundtrack kicks so much ass.” I love this exchange, but I’m a fan of the film anyway. It is not always clear to me how to handle advanced variable types in Rust, yet.

I think of these as wrappers that add abilities (and restrictions) to a variable. They give a variable super powers since the Rust compiler is so strict about what you can and can’t do with variables.


Box<T>

PROVIDES:
Smart pointer that forces your variable’s value to be stored on the heap instead of the stack. The Box<> variable itself is just a pointer so its size is obvious and can, itself, be stored on the stack.

RESTRICTIONS:

USEFUL WHEN:
If the size of an item cannot be determined at compile time, the compiler will complain when the default is to store it on the stack (where a known size is necessary). Using Box<> forces the storage onto the heap, so the only size the compiler needs at that spot is the size of the pointer. For example, a recursive data structure, including enums, will not work on the stack because a concrete size cannot be calculated. Turning the recursive field into a Box<> means it stores a pointer, which CAN be sized. The example in the docs being:

enum List<T> {
    Cons(T, Box<List<T>>),
    Nil,
}

Also useful if you have a very large T and want to transfer ownership of it without the data being copied each time: moving a Box<> only copies the pointer.
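
To make the recursive case a little more concrete, here is a minimal sketch that builds the List from the docs and walks it. The sum() helper is my own invention for illustration and not something from the docs; it just follows each boxed tail until it reaches Nil.

```rust
// The recursive enum from the docs: each Cons holds a value and a boxed tail.
enum List<T> {
    Cons(T, Box<List<T>>),
    Nil,
}

use List::{Cons, Nil};

// Hypothetical helper (not from the docs): add up every value in the list.
// `rest` is a &Box<List<i32>>, which deref-coerces to &List<i32>.
fn sum(list: &List<i32>) -> i32 {
    match list {
        Cons(value, rest) => value + sum(rest),
        Nil => 0,
    }
}

fn main() {
    // Each tail lives on the heap, so every Cons has a known, fixed size.
    let list = Cons(1, Box::new(Cons(2, Box::new(Cons(3, Box::new(Nil))))));
    println!("Sum of the list is : {}", sum(&list)); // prints 6
}
```

Without the Box<> around the tail, the compiler would refuse the enum as a recursive type with infinite size.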

NOTABLY PROVIDES:
just see the rust-lang.org docs

EXAMPLES/DISCUSSION:
https://doc.rust-lang.org/stable/rust-by-example/std/box.html
https://www.koderhq.com/tutorial/rust/smart-pointer/
https://manishearth.github.io/blog/2017/01/10/rust-tidbits-box-is-special/

Setting the value of a simple Box<> variable is easy enough and getting the value back looks very normal:

fn main() {
    let answer = Box::new(42);
    println!("The answer is : {}", answer);
}



Cell<T>

PROVIDES:
You can have multiple shared references to the Cell<> (and thus access to the value inside with .get(), which requires T to be Copy) and yet still mutate the value inside (with .set()). This is called interior mutability because the value inside can be changed but mut on the Cell<> itself is not needed. Through a shared reference, the inner value can only be set by calling a method on the Cell<>.

RESTRICTIONS:
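Through a shared reference it is not possible to get a reference to what is inside the Cell, only a copy of the value (or you can swap the whole value out). Also, Cell does not implement Sync, so a reference to it cannot be shared with a different thread, which ensures safety. (The Cell itself can still be moved to another thread if T is Send.)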

USEFUL WHEN:
Usually used for small values, such as counters or flags, where you need multiple shared references to the value AND be allowed to mutate it at the same time, in a guaranteed safe way.

NOTABLY PROVIDES:
.set() to set the value inside
.get() to get a copy of the value inside
.take() to move the value out of the Cell AND reset the value inside to its default (T must implement Default).
see the rust-lang.org docs

EXAMPLES/DISCUSSION:
https://hub.packtpub.com/shared-pointers-in-rust-challenges-solutions/
https://ricardomartins.cc/2016/06/08/interior-mutability

Setting the inner value of a Cell<> is only possible with a method call which is how it maintains safety:

use std::cell::Cell;
fn main() {
    let answer = Cell::new(0);
    answer.set(42);
    println!("The answer is : {}", answer.get());
}
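
For the counter use case mentioned above, here is a small sketch of my own. The Tracked struct and its read() method are hypothetical names, made up for illustration. The point is that read() only takes &self, yet the counter still goes up.

```rust
use std::cell::Cell;

// Hypothetical type for illustration: remembers how often it has been read.
struct Tracked {
    value: i32,
    reads: Cell<u32>,
}

impl Tracked {
    // Takes &self (a shared reference) and still updates the counter,
    // because Cell allows mutation through a shared reference.
    fn read(&self) -> i32 {
        self.reads.set(self.reads.get() + 1);
        self.value
    }
}

fn main() {
    let answer = Tracked { value: 42, reads: Cell::new(0) };

    // Two shared references alive at the same time, both able to bump the count.
    let first = &answer;
    let second = &answer;
    first.read();
    second.read();

    println!("Value {} was read {} times", answer.value, answer.reads.get()); // 2 times
}
```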

RefCell<T>

PROVIDES:
RefCell<> is very similar to Cell<> except it adds borrow checking, but at run-time instead of compile time! This means, unlike Cell<>, it is possible to write RefCell<> code which will panic!(). You borrow() a ref to the inner value for read-only or borrow_mut() in order to change it.

RESTRICTIONS:
borrow() will panic if a borrow_mut() is in place, and borrow_mut() will panic if either type is in place.

USEFUL WHEN:

NOTABLY PROVIDES:
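.borrow() to get a shared, read-only handle (a Ref<T>) to the value inside
.borrow_mut() to get an exclusive handle (a RefMut<T>) through which you can change the value
.try_borrow() and .try_borrow_mut() return a Result<>, so you get an Err instead of a panic!() when the borrow is not allowed.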
see the rust-lang.org docs

EXAMPLES/DISCUSSION:
https://ricardomartins.cc/2016/06/08/interior-mutability (again)

You must successfully borrow_mut() the RefCell<> in order to set the value (by dereferencing) and then simply borrow() it to retrieve the value:

use std::cell::RefCell;
fn main() {
    let answer = RefCell::new(0);
    *answer.borrow_mut() = 42;
    println!("The answer is : {}", answer.borrow());
}

Whereas something as simple as this compiles but panics at run-time, because break_things is still holding a mutable borrow when the second borrow_mut() is attempted. Imagine how much more obscure this code could be. Remember the rule: any number of read-only references, or exactly 1 read-write reference and nothing else. For RefCell, this is enforced at run-time:

use std::cell::RefCell;
fn main() {
    let answer = RefCell::new(0);
    let break_things = answer.borrow_mut();
    println!("The initial value is : {}", *break_things);
    *answer.borrow_mut() = 42;
    println!("The answer is : {}", answer.borrow());
}
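
If a panic is not what you want, try_borrow_mut() lets you find out whether the borrow would be allowed. This is just a sketch: try_set() is a helper name I made up for illustration, not part of RefCell.

```rust
use std::cell::RefCell;

// Hypothetical helper: set the value only if no other borrow is active.
// Returns true when the write happened, false when the borrow was refused.
fn try_set(cell: &RefCell<i32>, new_value: i32) -> bool {
    match cell.try_borrow_mut() {
        Ok(mut inner) => {
            *inner = new_value;
            true
        }
        Err(_) => false,
    }
}

fn main() {
    let answer = RefCell::new(0);
    {
        // A read-only borrow is alive in this block...
        let _reader = answer.borrow();
        // ...so the mutable borrow is refused with an Err, not a panic.
        println!("While borrowed, set worked? {}", try_set(&answer, 42)); // false
    }
    // The read-only borrow has been dropped, so now it goes through.
    println!("After release, set worked? {}", try_set(&answer, 42)); // true
    println!("The answer is : {}", answer.borrow()); // 42
}
```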

Rc<T>

PROVIDES:
Adds the feature of run-time reference counting to your variable, but this is the simple, lower-cost version – it is not thread safe.

RESTRICTIONS:
Right from the docs “you cannot generally obtain a mutable reference to something inside an Rc. If you need mutability, put a Cell or RefCell inside the Rc”. So while there is a get_mut() method, it only returns Some when no other Rc or Weak pointers to the same value exist, so it’s easier to just use a Cell<> or RefCell<> inside.

USEFUL WHEN:
You need run-time reference counting of a variable so it hangs around until the last reference of it is gone.

NOTABLY PROVIDES:
.clone() – get a new copy of the pointer to the same value, upping the reference count by 1.
see the rust-lang.org docs

EXAMPLES/DISCUSSION:
https://blog.sentry.io/2018/04/05/you-cant-rust-that#refcounts-are-not-dirty

Note that in the example below, my_answer is still pointing to valid memory even when correct_answer is dropped, because the Rc<> had an internal count of “2” and drops it to “1”, leaving the storage of “42” still valid.

use std::rc::Rc;
fn main() {
    let correct_answer = Rc::new(42);
    let my_answer = Rc::clone(&correct_answer);

    println!("The correct answer is : {}", correct_answer);
    drop(correct_answer);

    println!("And you got : {}", my_answer);
}
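
Following the advice from the docs quoted above, here is a sketch of a RefCell<> inside an Rc<>, so that two handles can both change the same Vec. The record() helper and the variable names are my own, chosen for illustration. Rc::strong_count() shows the reference count going up and down.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical helper: push a value through whichever handle we were given.
fn record(log: &Rc<RefCell<Vec<i32>>>, value: i32) {
    log.borrow_mut().push(value);
}

fn main() {
    let guesses = Rc::new(RefCell::new(Vec::new()));
    let other_handle = Rc::clone(&guesses);
    println!("Reference count is : {}", Rc::strong_count(&guesses)); // 2

    // Both handles point at the same Vec, and both can mutate it.
    record(&guesses, 41);
    record(&other_handle, 42);

    drop(other_handle);
    println!("Reference count is now : {}", Rc::strong_count(&guesses)); // 1
    println!("Guesses so far : {:?}", guesses.borrow()); // [41, 42]
}
```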

Arc<T>

PROVIDES:
Arc<> is an atomic reference counter, very similar to Rc<> above but thread-safe.

RESTRICTIONS:
More expensive than Rc<>. Also note, an Arc<T> can only be sent to or shared with another thread if the T you store has both the Send and Sync traits. So an Arc<RefCell<T>> will not work across threads because RefCell<> is not Sync.

USEFUL WHEN:
Same as Rc<>: you need run-time reference counting of a variable so it hangs around until the last reference to it is gone, but safe across threads as long as the inner <T> is.

NOTABLY PROVIDES:
see the rust-lang.org docs

EXAMPLES/DISCUSSION:
https://medium.com/@DylanKerler1/how-arc-works-in-rust-b06192acd0a6

Same idea as with Rc<>, we just show it working across multiple threads (and then sleep for just 10ms to give those threads time to finish, which is a shortcut: sleeping does not guarantee they are done).

use std::sync::Arc;
use std::thread;
use std::time::Duration;
fn main() {
    let answer = Arc::new(42);

    for threadno in 0..5 {
        let answer = Arc::clone(&answer);
        thread::spawn(move || {
            println!("Thread {}, answer is: {}", threadno + 1, answer);
        });
    }
    let ten_ms = Duration::from_millis(10);
    thread::sleep(ten_ms);
}
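
If you would rather not rely on the sleep, one alternative is to keep each JoinHandle and join() it, which waits until that thread has really finished. A minimal sketch, where read_in_threads() is a name I made up for illustration:

```rust
use std::sync::Arc;
use std::thread;

// Hypothetical helper: spawn `count` threads that each read the shared value,
// then join every one of them and collect what they saw.
fn read_in_threads(answer: Arc<i32>, count: usize) -> Vec<i32> {
    let handles: Vec<_> = (0..count)
        .map(|_| {
            // Each thread gets its own Arc pointing at the same value.
            let answer = Arc::clone(&answer);
            thread::spawn(move || *answer)
        })
        .collect();

    // join() blocks until the thread is done and hands back its return value.
    handles
        .into_iter()
        .map(|handle| handle.join().unwrap())
        .collect()
}

fn main() {
    let answer = Arc::new(42);
    let seen = read_in_threads(Arc::clone(&answer), 5);
    println!("Every thread saw : {:?}", seen); // [42, 42, 42, 42, 42]
}
```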



Mutex<T>

PROVIDES:
Mutual exclusion lock protecting shared data, even across threads.

RESTRICTIONS:
A thread which panics while holding the lock will “poison” the Mutex<>, after which lock() returns an Err for every thread (although the data can still be recovered from that error). The T stored must allow Send but Sync is not necessary.

USEFUL WHEN:
working on it!

NOTABLY PROVIDES:
see the rust-lang.org docs

EXAMPLES/DISCUSSION:
https://doc.rust-lang.org/book/ch16-03-shared-state.html

use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Duration;
fn main() {
    let answer = Arc::new(Mutex::new(42));

    for thread_no in 0..5 {
        let changer = Arc::clone(&answer);
        thread::spawn(move || {
            let mut changer = changer.lock().unwrap();
            println!("Setting answer to thread_no: {}", thread_no + 1,);
            *changer = thread_no + 1;
        });
    }
    let ten_ms = Duration::from_millis(10);
    thread::sleep(ten_ms);

    if answer.is_poisoned() {
        println!("Mutex was poisoned :(");
    } else {
        println!("Mutex survived :)");
        let final_answer = answer.lock().unwrap();
        println!("Ended with answer: {}", final_answer);
    }
}
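
To see poisoning actually happen, here is a sketch that panics on purpose while holding the lock. The helpers poison() and read_anyway() are hypothetical names of my own. The panic message from the spawned thread will show up on stderr, which is expected.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical helper: poison the mutex by panicking while its lock is held.
fn poison(shared: &Arc<Mutex<i32>>) {
    let clone = Arc::clone(shared);
    let result = thread::spawn(move || {
        let mut guard = clone.lock().unwrap();
        *guard = 13;
        panic!("thread died while holding the lock");
    })
    .join();
    // join() reports the panic as an Err instead of taking down main.
    assert!(result.is_err());
}

// Hypothetical helper: read the value whether or not the mutex is poisoned.
// into_inner() on the error gives back the guard that lock() refused to return.
fn read_anyway(shared: &Mutex<i32>) -> i32 {
    match shared.lock() {
        Ok(guard) => *guard,
        Err(poisoned) => *poisoned.into_inner(),
    }
}

fn main() {
    let answer = Arc::new(Mutex::new(42));
    poison(&answer);
    println!("Poisoned? {}", answer.is_poisoned()); // true
    println!("Value left behind : {}", read_anyway(&answer)); // 13
}
```

Whether it is sensible to keep using the value after a panic depends on what the panicking thread was in the middle of doing, so treat read_anyway() as an illustration rather than a recommendation.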

RwLock<T>

PROVIDES:
Similar to RefCell, but thread safe. borrow() is read(), borrow_mut() is write(). They don’t return an Option; they block until they do get the lock, and then return a Result<> that is only an Err if the lock was poisoned.

RESTRICTIONS:
Any thread which panics while holding a write lock will “poison” the RwLock<>, so that later read() and write() calls return an Err in all threads. A panic! while holding a read lock does not poison the RwLock. The T stored must allow both Send and Sync.

USEFUL WHEN:
working on it!

NOTABLY PROVIDES:
see the rust-lang.org docs

EXAMPLES/DISCUSSION:

Slightly fancier example, that shows getting both read() and write() locks on the value. If nothing panics, we should see the answer at the end.

use std::sync::{Arc, RwLock};
use std::thread;
use std::time::Duration;
fn main() {
    let answer = Arc::new(RwLock::new(42));

    for thread_no in 0..5 {
        if thread_no % 2 == 1 {
            let changer = Arc::clone(&answer);
            thread::spawn(move || {
                let mut changer = changer.write().unwrap();
                println!("Setting answer to thread_no: {}", thread_no + 1,);
                *changer = thread_no + 1;
            });
        } else {
            let reader = Arc::clone(&answer);
            thread::spawn(move || {
                let reader = reader.read().unwrap();
                println!(
                    "Checking  answer in thread_no: {}, value is {}",
                    thread_no + 1,
                    *reader
                );
            });
        }
    }
    let ten_ms = Duration::from_millis(10);
    thread::sleep(ten_ms);

    if answer.is_poisoned() {
        println!("RwLock was poisoned :(");
    } else {
        println!("RwLock survived :)");
        let final_answer = answer.read().unwrap();
        println!("Ended with answer: {}", final_answer);
    }
}

One run printed the following (the order varies with thread scheduling):

Checking answer in thread_no: 1, value is 42
Checking answer in thread_no: 3, value is 42
Setting answer to thread_no: 2
Checking answer in thread_no: 5, value is 2
Setting answer to thread_no: 4
RwLock survived :)
Ended with answer: 4
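
The "many readers or one writer" rule can also be poked at from a single thread with the non-blocking try_read() and try_write(). This is only a sketch, and probe_while_reading() is a hypothetical helper of my own. I use the try_ versions on purpose, since asking for a blocking lock a second time from the same thread is not something to rely on.

```rust
use std::sync::RwLock;

// Hypothetical helper: while one read lock is held, report whether
// (another reader could get in, a writer could get in).
fn probe_while_reading(lock: &RwLock<i32>) -> (bool, bool) {
    let _reader = lock.read().unwrap();
    // try_read()/try_write() return an Err instead of blocking.
    let second_reader_ok = lock.try_read().is_ok();
    let writer_ok = lock.try_write().is_ok();
    (second_reader_ok, writer_ok)
}

fn main() {
    let answer = RwLock::new(42);

    let (reader_ok, writer_ok) = probe_while_reading(&answer);
    println!("Second reader allowed? {}", reader_ok);
    println!("Writer allowed while reading? {}", writer_ok);

    // No read locks are left, so a write lock is available again.
    *answer.write().unwrap() += 1;
    println!("The answer is now : {}", answer.read().unwrap()); // 43
}
```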

Summary

There are more, plus many custom types; I’ve even used some of them, like the crate once_cell. I started using that for the web app I was (am?) working on and wrote a little about it. Also, as you saw in the last two examples, you can combine types when you need multiple functionalities. I have included these examples in a GitHub repo, pointers.
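
Since OnceCell made the list at the top but did not get its own section, here is a very small sketch. Note the assumption: it uses std::cell::OnceCell from the standard library (available in recent Rust versions) rather than the once_cell crate I actually used, and init_once() is a hypothetical helper. The idea is a value that can be set exactly once through a shared reference.

```rust
use std::cell::OnceCell;

// Hypothetical helper: initialise the cell on first use, reuse it afterwards.
fn init_once(cell: &OnceCell<i32>) -> i32 {
    *cell.get_or_init(|| {
        // This closure only runs the first time through.
        println!("Working out the answer...");
        42
    })
}

fn main() {
    let answer: OnceCell<i32> = OnceCell::new();
    println!("Set yet? {}", answer.get().is_some()); // false

    println!("The answer is : {}", init_once(&answer)); // runs the closure
    println!("The answer is : {}", init_once(&answer)); // reuses the stored 42

    // A second set() is refused; the first value stays in place.
    println!("Overwrite worked? {}", answer.set(7).is_ok()); // false
}
```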

I’ll probably hear about or (much more slowly) learn about mistakes I’ve made in wording here or come up with much better examples and excuses for using these various types, so I’ll try to update this post as I do. I see using this myself as a reference until I am really familiar with each of these types. Obviously, any mistakes here are mine alone as I learn Rust and not from any of the links or sources I listed!

Also, lots of help from 3 YouTubers I’ve been watching. The best examples can be seen as they write code and explain why they need something inside an Rc<> or in a Mutex<>. Check out their streams and watch over their shoulder as they code!!

Author: Jeff Culverhouse

I am a remote Sr Software Engineer for ZipRecruiter.com, mainly perl. Learning Rust in my spare time. Plus taking classes at James Madison University. Culverhouse - English: from Old English culfrehūs ‘dovecote’, hence a topographic name for someone living near a dovecote, or possibly a metonymic occupational name for the keeper of a dovecote. ISTP, occasionally INTP.

One thought on “(A Few) Advanced Variable Types in Rust”

  1. great post. I feel the docs really do not help much. Rust is for serious twiddly stuff. The books discuss the basic syntax but dont drill into what real programmers need to know for real solutions. Especially experienced devs from other languages (me c++, c#). Stuff thats automagic in c# and fine once you work out how to use smart pointers in c++ is way under explained in rust docs. One second I have let &mut x = 42; All of a sudden I have Foo, Box() + Send>, &***foo. I mean WTF. The problem is that rust is complicated so they try to do some things automatically, auto deref, elided lifetimes,… tra la la all happy, then bam you are in the deep end. Scuse the rant – Great post. Do one on why I need Option all over the place
