A pointer that cleans up after itself walks into a scope. Nothing leaks.
A smart pointer is a type that owns a value on the heap and runs
cleanup automatically when it goes out of scope. The "smart" part is
the cleanup: there's no free, no delete, no dispose(). When the
owning binding drops, the destructor runs and the memory comes back.
If you've used C++, you already know the idea. Box<T> is Rust's
std::unique_ptr<T>: one owner, freed on drop. Rc<T> is
std::shared_ptr<T>: shared ownership via a reference count. Both
are instances of RAII (Resource Acquisition Is
Initialization): tie the lifetime of a resource to the lifetime of a
stack value, and let scope exit do the cleanup.
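As a minimal sketch of scope-driven cleanup, the snippet below uses a hypothetical Tracked type (an illustration, not part of the chapter's exercises) that counts how often its destructor runs. The count is read once while the box is alive and once after its scope has ended.

```rust
use std::cell::Cell;

/// Hypothetical type for illustration: bumps a counter when it is dropped.
struct Tracked<'a> {
    drops: &'a Cell<u32>,
}

impl Drop for Tracked<'_> {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

/// Returns (drop count inside the scope, drop count after the scope).
fn drop_counts() -> (u32, u32) {
    let drops = Cell::new(0);
    let inside;
    {
        let _boxed = Box::new(Tracked { drops: &drops });
        inside = drops.get(); // the box is still alive, nothing dropped yet
    } // _boxed leaves scope here: Drop runs, then the heap memory is freed
    (inside, drops.get())
}
```

Nothing in drop_counts calls a cleanup function by hand; the closing brace of the inner block is the only trigger.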
If your background is Java, Python, or JavaScript, the comparison is
trickier. A String in Java is a reference to a heap object that the
garbage collector reclaims when no one is looking at it anymore. Those
references are not smart pointers in the Rust sense: there is no single
owner, and you don't know when (or whether) cleanup happens. Rust's
smart pointers give you the heap allocation without the GC, because
ownership tells the compiler exactly when to drop.
Box<T>: heap allocation with a single owner. The simplest
smart pointer. You give it a value, it puts it on the heap, and it
frees it when the box goes out of scope.
Box gives the compiler a fixed-size handle
to put in the struct.
Box<dyn Trait>. Picking up where Chapter 14 left off. A
trait object like dyn Shape doesn't have a known size, so it
has to live behind a pointer. Box<dyn Shape> is the owned form,
and it's what lets you store a Vec of mixed concrete types that
all implement the same trait. The ? operator chapter (Chapter
17) uses the same trick with Box<dyn Error>.
Real codebases reach for two other smart pointers often enough that
they deserve a mention here, even though the exercises in this
chapter focus on Box.
Rc<T>: shared ownership, single-threaded
Box<T> has exactly one owner. Sometimes you genuinely need several
parts of a program to share ownership of the same value, and you
can't predict which one will be the last to let go. Rc<T>
("reference counted") tracks the number of owners and drops the
value when the count hits zero.
use std::rc::Rc;
let a = Rc::new(String::from("shared"));
let b = Rc::clone(&a); // bumps the count to 2, no deep copy
let c = Rc::clone(&a); // count is now 3
// When a, b, and c all go out of scope, the String is dropped.
Rc is single-threaded. The multi-threaded equivalent is Arc<T>
("atomically reference counted"), which you'll meet when concurrency
shows up. C++ devs: Rc is shared_ptr without the atomic overhead,
Arc is shared_ptr with it.
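To watch the count move, one option is Rc::strong_count from the standard library. The helper name strong_counts is made up for this sketch; it records the count after creation, after a clone, and after that clone is dropped.

```rust
use std::rc::Rc;

/// Returns the strong count at three points: fresh, after one clone,
/// and after that clone has been dropped again.
fn strong_counts() -> (usize, usize, usize) {
    let a = Rc::new(String::from("shared"));
    let fresh = Rc::strong_count(&a);

    let b = Rc::clone(&a); // new owner, same heap String
    let cloned = Rc::strong_count(&a);

    drop(b); // one owner lets go; the String stays alive because a remains
    let after_drop = Rc::strong_count(&a);

    (fresh, cloned, after_drop)
}
```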
RefCell<T>: interior mutability
The borrow checker normally enforces "one mutable reference or many
immutable ones" at compile time. RefCell<T> moves that check to
runtime: you can hand out an &RefCell<T>, and someone holding it
can still mutate the inner value by calling .borrow_mut(). If the
borrowing rules are violated, the program panics instead of failing
to compile.
If you're coming from Java, this is close to a field with a private
setter: outside code holds an immutable handle to the object, but
the object can still mutate itself. You will not need RefCell for
a long time. It pairs with Rc to build graph-shaped data, and it
shows up in some testing patterns. Mentioned here so you recognize
the name when you see it.
Box::new(value) allocates space on the heap, moves value into it,
and hands you back a Box<T> that owns that allocation. When the
box goes out of scope, Rust drops the inner value and frees the
memory. No free, no leaks.
A Box<T> acts like the value it holds. The * operator
dereferences it, and most method calls work through automatic
dereferencing without you writing * at all.
let boxed: Box<i32> = Box::new(7);
let n: i32 = *boxed; // explicit deref
assert_eq!(n + 1, 8);
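Auto-deref on method calls can be sketched the same way. The function name below is hypothetical; it calls len() on a Box<String> once without * and once with it, and checks that the two agree.

```rust
/// Returns the string length seen through the box, plus whether the
/// explicit-deref call agrees with the auto-deref call.
fn boxed_len() -> (usize, bool) {
    let boxed: Box<String> = Box::new(String::from("heap"));
    let len = boxed.len(); // auto-deref: the call reaches through the Box
    let same = (*boxed).len() == len; // explicit deref, same result
    (len, same)
}
```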
Why bother boxing a tiny i32? You usually wouldn't. Box earns
its keep when:
For this exercise we keep it simple: take two boxed integers, add them, return the sum.
Useful from the standard library
- Box::new is the only constructor you need here.
- Deref with *a and *b, so the sum is just *a + *b. This works because i32 is Copy, so reading through the box doesn't move anything out.
*a reads the i32 out of the box.
i32 is Copy, so reading through the box doesn't move anything
out. You can use *a and *b as many times as you want.
*a + *b
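Put together, the exercise might look like the sketch below. The function names and signatures are assumptions (the exercise's real ones may differ); add_twice exists only to show that reading *a and *b repeatedly is fine for a Copy type.

```rust
/// Hypothetical signature for the exercise: takes two boxes, returns the sum.
fn add_boxed(a: Box<i32>, b: Box<i32>) -> i32 {
    *a + *b
}

/// Reads through each box twice; i32 is Copy, so nothing is moved out.
fn add_twice(a: Box<i32>, b: Box<i32>) -> i32 {
    let first = *a + *b;
    let second = *a + *b;
    first + second
}
```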
Try to imagine this enum without the Box:
enum Expr {
    Num(i32),
    Add(Expr, Expr), // a node holds two whole sub-expressions inline
    Mul(Expr, Expr),
}
The compiler has to decide how many bytes one Expr occupies.
Add is at least two Exprs, each of which is at least two
Exprs, which is... you see the problem. The size is infinite,
and the compiler refuses to lay out the type.
Box<Expr> fixes it. A Box is always one pointer wide, regardless
of what it points to, so the compiler now knows that Add is two
pointers' worth of memory. The actual sub-expressions live on the
heap, reached through those pointers.
enum Expr {
    Num(i32),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
}
This is the same trick C uses with struct node { struct node *l; struct node *r; } and that Java/C# get for free because every
object is already a reference. Rust just wants you to ask for the
indirection explicitly.
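The size claim can be checked with std::mem::size_of. This is a sketch with made-up helper names; the exact size of Expr is left to the compiler, so the code only relies on Box<Expr> being as wide as a usize.

```rust
use std::mem::size_of;

#[allow(dead_code)] // the variants are never constructed in this sketch
enum Expr {
    Num(i32),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
}

/// A Box of a sized type is a single pointer, the same width as usize.
fn box_is_pointer_sized() -> bool {
    size_of::<Box<Expr>>() == size_of::<usize>()
}

/// With boxes in place the compiler can compute a finite size for Expr.
fn expr_size() -> usize {
    size_of::<Expr>()
}
```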
Expr is a tiny expression tree: a value is either a literal
number, the sum of two sub-expressions, or the product of two
sub-expressions. Many interpreters and calculators represent code
internally in much the same way. Parsing text like "(1 + 2) * 4"
produces an Expr tree; evaluating that tree is just walking it.
Your job is the evaluation half: implement Expr::eval(&self) -> i32
so it returns the numeric value of the whole tree. Recursion mirrors
the data perfectly. Num(v) is the base case (just return v);
Add(l, r) returns l.eval() + r.eval(); Mul(l, r) does the
same with *.
The match gives you a borrow of each inner Box<Expr>, and method
calls auto-deref through the box, so l.eval() works directly
without (*l).eval().
Useful from the standard library
- Box::new(Expr::Num(2)) builds a leaf you can put on either side of an Add or Mul. The tests wire larger trees together for you.
- Pattern matching on a reference: match self { Expr::Add(l, r) => ... } binds l: &Box<Expr> and r: &Box<Expr>. Method calls auto-deref, so l.eval() is fine; **l would reach the inner Expr if you ever needed it directly.
- Recursion in Rust works exactly like recursion anywhere else. There's no tail-call optimization guarantee, but the test trees are tiny.
Match on self. There are three arms: Expr::Num,
Expr::Add, and Expr::Mul.
Num(v) returns *v (it's a borrow, so deref to get the i32).
Add(l, r) returns l.eval() + r.eval(). Mul(l, r) is the same
concept, but for *. Method calls auto-deref through the Box, so you
do not need to write (*l).eval().
match self {
    Expr::Num(v) => *v,
    Expr::Add(l, r) => l.eval() + r.eval(),
    Expr::Mul(l, r) => l.eval() * r.eval(),
}
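For a version you can run end to end, here is a sketch that wraps that match in an impl block and builds the tree for (1 + 2) * 4 by hand. The sample_tree helper is made up for illustration; no parser is involved.

```rust
enum Expr {
    Num(i32),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
}

impl Expr {
    /// Walks the tree recursively and returns its numeric value.
    fn eval(&self) -> i32 {
        match self {
            Expr::Num(v) => *v,
            Expr::Add(l, r) => l.eval() + r.eval(),
            Expr::Mul(l, r) => l.eval() * r.eval(),
        }
    }
}

/// Hand-built tree for (1 + 2) * 4.
fn sample_tree() -> Expr {
    Expr::Mul(
        Box::new(Expr::Add(
            Box::new(Expr::Num(1)),
            Box::new(Expr::Num(2)),
        )),
        Box::new(Expr::Num(4)),
    )
}
```

Calling sample_tree().eval() evaluates the Add node to 3 first, then multiplies by 4.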
Chapter 14 introduced trait objects (dyn Trait) and ended on a
puzzle: a dyn Trait doesn't have a known size at compile time
(different implementors are different sizes), so the compiler won't
let you put one directly in a Vec or return one from a function.
The fix is to put it behind a pointer, and the owned pointer is
Box<dyn Trait>.
let pipeline: Vec<Box<dyn Command>> = vec![
    Box::new(Uppercase),
    Box::new(Append { suffix: "!".to_string() }),
];
Every entry in the vector is one box: a fat pointer (a data pointer
plus a vtable pointer), so all entries are the same
size. Each box owns whatever concrete type it wraps. Dropping the
vector drops the boxes, which drops the inner values. This is the
exact same pattern you'll see in Chapter 19 as Box<dyn Error>:
"some value, I don't care which concrete type, just give me one
owned thing that implements the trait."
Calling a method on a Box<dyn Command> looks like calling it on
the concrete type: cmd.run(input). Under the hood, Rust does a
vtable lookup (the same trick C++ uses for virtual methods) to
pick the right implementation. The cost is one extra indirection
per call; the benefit is the heterogeneity above.
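A sketch of what that looks like in practice. The two Command implementations below are assumptions written for this example (the chapter's own impls may differ), and run_dyn and boxed_widths are made-up helpers. On current Rust targets a Box<dyn Command> is two machine words wide, one for the data pointer and one for the vtable pointer, while a Box of a concrete type is one.

```rust
use std::mem::size_of;

trait Command {
    fn run(&self, input: &str) -> String;
}

// Assumed implementations, for illustration only.
struct Uppercase;
struct Append {
    suffix: String,
}

impl Command for Uppercase {
    fn run(&self, input: &str) -> String {
        input.to_uppercase()
    }
}

impl Command for Append {
    fn run(&self, input: &str) -> String {
        format!("{}{}", input, self.suffix)
    }
}

/// The concrete type is unknown here; the call goes through the vtable.
fn run_dyn(cmd: &dyn Command, input: &str) -> String {
    cmd.run(input)
}

/// Returns (width of a Box of a concrete type, width of a Box<dyn Command>).
fn boxed_widths() -> (usize, usize) {
    (size_of::<Box<Uppercase>>(), size_of::<Box<dyn Command>>())
}
```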
A tiny text-transformation pipeline. The trait is one method:
trait Command {
    fn run(&self, input: &str) -> String;
}
Three commands are already implemented for you:
- Uppercase upper-cases the input.
- Reverse reverses the input.
- Append { suffix } appends a configured suffix.
The exercise is the orchestrator: apply_pipeline threads an input
string through every command in order, feeding each command's output
into the next command's input, and returns the final result. An empty
pipeline returns the input unchanged.
The reason this works is Box<dyn Command>. The pipeline can mix
Uppercase (a unit struct), Reverse (also a unit struct), and
Append { suffix: String } (carries a field) in the same Vec,
because each one is hidden behind the same fat pointer. A generic
Vec<C> where C: Command would only let you pick one concrete
command type per pipeline.
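To make the contrast concrete, here is a sketch of the generic alternative. The function apply_homogeneous is hypothetical and the Append impl is an assumption; the point is that C is fixed to a single type per call, so a slice of Append values works but a mix of Append and Uppercase would not type-check.

```rust
trait Command {
    fn run(&self, input: &str) -> String;
}

// Assumed implementation, for illustration only.
struct Append {
    suffix: String,
}

impl Command for Append {
    fn run(&self, input: &str) -> String {
        format!("{}{}", input, self.suffix)
    }
}

/// Hypothetical generic variant: every element must be the same type C.
/// Calls are statically dispatched, with no vtable involved.
fn apply_homogeneous<C: Command>(commands: &[C], input: &str) -> String {
    let mut current = input.to_string();
    for cmd in commands {
        current = cmd.run(&current);
    }
    current
}
```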
Useful from the standard library
- A for loop over &[Box<dyn Command>] yields &Box<dyn Command> on each iteration. Method calls auto-deref through the box (and through the &), so cmd.run(...) just works.
- The pipeline is a fold: start with the input, and at each step the next command takes the previous output. A plain let mut current = input.to_string(); plus reassignment in the loop is the most readable thing.
- str::chars().rev().collect::<String>() is one way to reverse a string (it's already written in the Reverse impl below). Note that this reverses by Unicode scalar value, not by grapheme; the tests stick to ASCII so it doesn't matter here.
Keep a current string, replace
it with cmd.run(&current) on each iteration, return it at the
end.
With an empty pipeline the loop never runs, so current ends up as
the original input. That gets the empty-pipeline test for free.
cmd.run(...) is
the only thing you call inside the loop.
let mut current = input.to_string();
for cmd in commands {
    current = cmd.run(&current);
}
current
Or, if you've peeked at Chapter 16:
commands
    .iter()
    .fold(input.to_string(), |acc, cmd| cmd.run(&acc))
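For reference, a self-contained sketch of the whole exercise. The three Command implementations and the parameter order of apply_pipeline are assumptions made for this example; the exercise's own code may differ in detail.

```rust
trait Command {
    fn run(&self, input: &str) -> String;
}

// Assumed implementations, matching the descriptions in the text.
struct Uppercase;
struct Reverse;
struct Append {
    suffix: String,
}

impl Command for Uppercase {
    fn run(&self, input: &str) -> String {
        input.to_uppercase()
    }
}

impl Command for Reverse {
    fn run(&self, input: &str) -> String {
        // Reverses by Unicode scalar value, not by grapheme cluster.
        input.chars().rev().collect::<String>()
    }
}

impl Command for Append {
    fn run(&self, input: &str) -> String {
        format!("{}{}", input, self.suffix)
    }
}

/// Feeds the input through every command in order.
/// An empty pipeline returns the input unchanged.
fn apply_pipeline(commands: &[Box<dyn Command>], input: &str) -> String {
    let mut current = input.to_string();
    for cmd in commands {
        current = cmd.run(&current);
    }
    current
}
```

With the pipeline Uppercase, Reverse, Append { suffix: "!" } and the input "abc", the intermediate values are "ABC", then "CBA", then "CBA!".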
You boxed an integer and added it back out, defined a recursive
expression-tree type that only compiles because of Box, and
threaded an input string through a heterogeneous pipeline of
text-transformation commands via Box<dyn Command>.
What we learned
- A smart pointer is a type that owns heap data and runs cleanup automatically on drop. It's RAII: the destructor releases the resource, so you never write free or delete.
- Box<T> is the simplest smart pointer: one owner, one heap allocation, dropped when the box goes out of scope. C++ devs: this is std::unique_ptr<T>.
- Box::new(value) constructs a box. *boxed dereferences it, and most method calls auto-deref so you rarely need to write * by hand.
- Recursive enums need indirection. Add(Expr, Expr) is infinitely sized; Add(Box<Expr>, Box<Expr>) is two pointers. The compiler can lay it out, and recursion mirrors the data exactly. The same concept underpins parsers, interpreters, and ASTs everywhere.
- Box<dyn Trait> is the owned form of a trait object. It lets you store mixed concrete types behind a single interface (a Vec<Box<dyn Command>> of pipeline stages, all different structs, driven through one trait) and underpins the Box<dyn Error> pattern you'll meet in Chapter 19.
- Dynamic dispatch through a trait object costs one vtable lookup per call. That's usually fine. Reach for generics (fn f<T: Command>) when you want the compiler to monomorphize away the indirection.
- Rc<T> ("reference counted") gives you multiple owners on a single thread. The value is dropped when the last Rc goes away. C++ analogue: std::shared_ptr<T> without the atomic overhead.
- Arc<T> is the same idea but safe to share across threads. It shows up when concurrency does.
- RefCell<T> provides interior mutability: borrow checking moves from compile time to runtime, so you can mutate through a shared reference. It pairs with Rc for graph-shaped data and shows up in some testing patterns. You can go a long way without needing it.
Chapter 16 puts iterators front and center, and you'll see how a
chain of .iter().fold(...) could have replaced the explicit loop
in apply_pipeline. Chapter 19 brings Box<dyn Error> and the
? operator together, which is the day-to-day payoff for
understanding Box<dyn Trait> here.