Chapter 15

Smart Pointers

A pointer that cleans up after itself walks into a scope. Nothing leaks.

A smart pointer is a type that owns a value on the heap and runs cleanup automatically when it goes out of scope. The "smart" part is the cleanup: there's no free, no delete, no dispose(). When the owning binding drops, the destructor runs and the memory comes back.

If you've used C++, you already know the idea. Box<T> is Rust's std::unique_ptr<T>: one owner, freed on drop. Rc<T> is the counterpart of std::shared_ptr<T>: shared ownership via a reference count. Both are instances of RAII (Resource Acquisition Is Initialization): tie the lifetime of a resource to the lifetime of a stack value, and let scope exit do the cleanup.
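To make the RAII idea concrete, here is a minimal sketch that observes a destructor running at scope exit. The Resource type, the DROPS counter, and the drops_at_scope_exit function are hypothetical names invented for this illustration; they are not part of the exercises.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical global counter, used only to observe when destructors run.
static DROPS: AtomicUsize = AtomicUsize::new(0);

// Hypothetical resource type; the field stands in for real data.
struct Resource {
    _id: u32,
}

impl Drop for Resource {
    // Rust calls this automatically when the owner goes out of scope.
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::SeqCst);
    }
}

// Returns how many destructors ran while the inner scope was exiting.
fn drops_at_scope_exit() -> usize {
    let before = DROPS.load(Ordering::SeqCst);
    {
        let _boxed = Box::new(Resource { _id: 1 });
        // Still alive here: nothing has been dropped yet.
        assert_eq!(DROPS.load(Ordering::SeqCst), before);
    } // `_boxed` goes out of scope: Drop runs, then the heap memory is freed.
    DROPS.load(Ordering::SeqCst) - before
}
```

Calling drops_at_scope_exit() returns 1: the only cleanup code that ran is the Drop impl, and nobody called it by hand.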

If your background is Java, Python, or JavaScript, the comparison is trickier. A String in Java is a reference to a heap object that the garbage collector reclaims at some point after it becomes unreachable. Those references are not smart pointers in the Rust sense: there is no single owner, and you don't know when (or whether) cleanup happens. Rust's smart pointers give you the heap allocation without the GC, because ownership tells the compiler exactly when to drop.

What's in this chapter

  1. Box<T>: heap allocation with a single owner. The simplest smart pointer. You give it a value, it puts it on the heap, and it frees it when the box goes out of scope.
  2. Recursive types. Some types are impossible to write without indirection. A linked-list node that contains another node would be infinitely sized; Box gives the compiler a fixed-size handle to put in the struct.
  3. Box<dyn Trait>. Picking up where Chapter 14 left off. A trait object like dyn Shape doesn't have a known size, so it has to live behind a pointer. Box<dyn Shape> is the owned form, and it's what lets you store a Vec of mixed concrete types that all implement the same trait. The ? operator chapter (Chapter 19) uses the same trick with Box<dyn Error>.

Two more smart pointers worth knowing about

Real codebases reach for two other smart pointers often enough that they deserve a mention here, even though the exercises in this chapter focus on Box.

Rc<T>: shared ownership, single-threaded

Box<T> has exactly one owner. Sometimes you genuinely need several parts of a program to share ownership of the same value, and you can't predict which one will be the last to let go. Rc<T> ("reference counted") tracks the number of owners and drops the value when the count hits zero.

use std::rc::Rc;

let a = Rc::new(String::from("shared"));
let b = Rc::clone(&a);   // bumps the count to 2, no deep copy
let c = Rc::clone(&a);   // count is now 3
// When a, b, and c all go out of scope, the String is dropped.

Rc is single-threaded. The multi-threaded equivalent is Arc<T> ("atomically reference counted"), which you'll meet when concurrency shows up. C++ devs: Rc is shared_ptr without the atomic overhead, Arc is shared_ptr with it.
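The count itself can be inspected with Rc::strong_count from the standard library. The sketch below wraps the snippet above in a hypothetical function, owner_counts, so the three readings can be compared side by side.

```rust
use std::rc::Rc;

// Returns the owner count right after creation, after one clone,
// and after that clone is dropped.
fn owner_counts() -> (usize, usize, usize) {
    let a = Rc::new(String::from("shared"));
    let after_new = Rc::strong_count(&a);

    let b = Rc::clone(&a); // new handle to the same String, no deep copy
    let after_clone = Rc::strong_count(&a);

    drop(b); // one owner lets go; the String stays alive because `a` remains
    let after_drop = Rc::strong_count(&a);

    (after_new, after_clone, after_drop)
}
```

owner_counts() returns (1, 2, 1). The String is only freed when a, the last owner, goes out of scope at the end of the function.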

RefCell<T>: interior mutability

The borrow checker normally enforces "one mutable reference or many immutable ones" at compile time. RefCell<T> moves that check to runtime: you can hand out an &RefCell<T>, and someone holding it can still mutate the inner value by calling .borrow_mut(). If the borrowing rules are violated, the program panics instead of failing to compile.

If you're coming from Java, this is close to a field with a private setter: outside code holds an immutable handle to the object, but the object can still mutate itself. You will not need RefCell for a long time. It pairs with Rc to build graph-shaped data, and it shows up in some testing patterns. Mentioned here so you recognize the name when you see it.

Putting a value on the heap with `Box`

Box::new(value) allocates space on the heap, moves value into it, and hands you back a Box<T> that owns that allocation. When the box goes out of scope, Rust drops the inner value and frees the memory. No free, no leaks.

A Box<T> acts like the value it holds. The * operator dereferences it, and most method calls work through automatic dereferencing without you writing * at all.

let boxed: Box<i32> = Box::new(7);
let n: i32 = *boxed;            // explicit deref
assert_eq!(n + 1, 8);
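The explicit deref above is one half of the story. The other half, method calls that reach through the box on their own, might look like the following sketch. The describe function is a hypothetical helper, and Box<String> is used only because String has convenient methods to call.

```rust
// Takes ownership of a boxed String and calls String/str methods on it directly.
fn describe(boxed: Box<String>) -> (usize, String) {
    let len = boxed.len();            // auto-deref: Box<String> -> String
    let upper = boxed.to_uppercase(); // auto-deref again: String -> str
    let explicit = (*boxed).len();    // the same call with the deref spelled out
    assert_eq!(len, explicit);
    (len, upper)
}
```

describe(Box::new(String::from("hello"))) returns (5, "HELLO"). Neither method call needed a *.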

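Why bother boxing a tiny i32? You usually wouldn't. Box earns its keep when a type needs indirection to have a known size (the recursive types later in this chapter) or when a value's concrete type is hidden behind a trait (Box<dyn Trait>).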

For this exercise we keep it simple: take two boxed integers, add them, return the sum.

Useful from the standard library

  • Box::new is the only constructor you need here.
  • Deref with *a and *b: the sum is *a + *b. Arithmetic operators don't auto-deref the way method calls do, so the * is required. Reading through the box doesn't move anything out, because i32 is Copy.
Exercise 1 of 3

    Hints
    1. *a reads the i32 out of the box.
    2. i32 is Copy, so reading through the box doesn't move anything out. You can use *a and *b as many times as you want.
    3. *a + *b
      

    A recursive type that needs `Box`

    Try to imagine this enum without the Box:

    enum Expr {
        Num(i32),
        Add(Expr, Expr),   // a node holds two whole sub-expressions inline
        Mul(Expr, Expr),
    }
    

    The compiler has to decide how many bytes one Expr occupies. Add is at least two Exprs, each of which is at least two Exprs, which is... you see the problem. The size is infinite, and the compiler refuses to lay out the type.

    Box<Expr> fixes it. A Box<Expr> is always one pointer wide, no matter how large the tree it points to, so the compiler now knows that Add is two pointers' worth of memory. The actual sub-expressions live on the heap, reached through those pointers.

    enum Expr {
        Num(i32),
        Add(Box<Expr>, Box<Expr>),
        Mul(Box<Expr>, Box<Expr>),
    }
    

    This is the same trick C uses with struct node { struct node *l; struct node *r; } and that Java/C# get for free because every object is already a reference. Rust just wants you to ask for the indirection explicitly.
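The size claim can be checked with std::mem::size_of. This is a minimal sketch; box_is_one_pointer and expr_size are hypothetical helper names, and the exact byte count of Expr depends on the target, so the code reports it rather than assuming a number.

```rust
use std::mem::size_of;

// The boxed enum from the text.
#[allow(dead_code)]
enum Expr {
    Num(i32),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
}

// True when a Box<Expr> occupies exactly one machine pointer.
fn box_is_one_pointer() -> bool {
    size_of::<Box<Expr>>() == size_of::<usize>()
}

// Size in bytes of one Expr value; finite because each child is only a pointer.
fn expr_size() -> usize {
    size_of::<Expr>()
}
```

box_is_one_pointer() returns true, and expr_size() is at least two pointers' worth of bytes, since the Add and Mul variants each hold two boxes.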

    What you're building

    Expr is a tiny expression tree: a value is either a literal number, the sum of two sub-expressions, or the product of two sub-expressions. This is how interpreters and calculators typically represent code internally. Parsing text like "(1 + 2) * 4" produces an Expr tree; evaluating that tree is just walking it.

    Your job is the evaluation half: implement Expr::eval(&self) -> i32 so it returns the numeric value of the whole tree. Recursion mirrors the data perfectly. Num(v) is the base case (just return v); Add(l, r) returns l.eval() + r.eval(); Mul(l, r) does the same with *.

    The match gives you a borrow of each inner Box<Expr>, and method calls auto-deref through the box, so l.eval() works directly without (*l).eval().

    Useful from the standard library

    • Box::new(Expr::Num(2)) builds a leaf you can put on either side of an Add or Mul. The tests wire larger trees together for you.
    • Pattern matching on a reference: match self { Expr::Add(l, r) => ... } binds l: &Box<Expr> and r: &Box<Expr>. Method calls auto-deref, so l.eval() is fine; *l would reach the inner Expr if you ever needed it directly.
    • Recursion in Rust works exactly like recursion anywhere else. There's no tail-call optimization guarantee, but the test trees are tiny.
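To see the pieces above fit together without giving away eval, here is a sketch that builds the tree for "(1 + 2) * 4" by hand and walks it with a different recursive method. Both count_nums and sample_tree are hypothetical helpers, not part of the exercise.

```rust
#[allow(dead_code)]
enum Expr {
    Num(i32),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
}

impl Expr {
    // Hypothetical helper (not part of the exercise): counts the Num leaves.
    fn count_nums(&self) -> usize {
        match self {
            Expr::Num(_) => 1,
            // l and r are &Box<Expr>; the method call auto-derefs through the box.
            Expr::Add(l, r) | Expr::Mul(l, r) => l.count_nums() + r.count_nums(),
        }
    }
}

// Builds the tree for "(1 + 2) * 4" by hand.
fn sample_tree() -> Expr {
    Expr::Mul(
        Box::new(Expr::Add(
            Box::new(Expr::Num(1)),
            Box::new(Expr::Num(2)),
        )),
        Box::new(Expr::Num(4)),
    )
}
```

sample_tree().count_nums() returns 3, one for each literal. The shape of the match is the same one eval needs: a base case for Num and a recursive case for each operator.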
    Exercise 2 of 3

      Hints
      1. Pattern-match on self. There are three arms: Expr::Num, Expr::Add, and Expr::Mul.
      2. Num(v) returns *v (it's a borrow, so deref to get the i32). Add(l, r) returns l.eval() + r.eval(). Mul(l, r) is the same concept, but for *. Method calls auto-deref through the Box, so you do not need to write (*l).eval().
      3. match self {
            Expr::Num(v) => *v,
            Expr::Add(l, r) => l.eval() + r.eval(),
            Expr::Mul(l, r) => l.eval() * r.eval(),
        }
        

      Mixed types behind one trait: `Box<dyn Trait>`

      Chapter 14 introduced trait objects (dyn Trait) and ended on a puzzle: a dyn Trait doesn't have a known size at compile time (different implementors are different sizes), so the compiler won't let you put one directly in a Vec or return one from a function. The fix is to put it behind a pointer, and the owned pointer is Box<dyn Trait>.

      let pipeline: Vec<Box<dyn Command>> = vec![
          Box::new(Uppercase),
          Box::new(Append { suffix: "!".to_string() }),
      ];
      

      Every entry in the vector is one box: a fat pointer (a data pointer plus a vtable pointer), so all entries are the same size. Each box owns whatever concrete type it wraps. Dropping the vector drops the boxes, which drops the inner values. This is the exact same pattern you'll see in Chapter 19 as Box<dyn Error>: "some value, I don't care which concrete type, just give me one owned thing that implements the trait."

      Calling a method on a Box<dyn Command> looks like calling it on the concrete type: cmd.run(input). Under the hood, Rust does a vtable lookup (the same trick C++ uses for virtual methods) to pick the right implementation. The cost is one extra indirection per call; the benefit is the heterogeneity above.
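The sketch below shows dynamic dispatch over a mixed vector, using a Shape trait in the spirit of the dyn Shape mentioned earlier. The trait body, the Square and Rect types, and the helper functions are all assumptions made for this illustration. It also checks that a Box<dyn Shape> is two pointers wide (data pointer plus vtable pointer), which is the layout Rust uses for trait-object pointers today.

```rust
use std::mem::size_of;

// Hypothetical trait and types, for illustration only.
trait Shape {
    fn area(&self) -> f64;
}

struct Square { side: f64 }
struct Rect { w: f64, h: f64 }

impl Shape for Square {
    fn area(&self) -> f64 { self.side * self.side }
}
impl Shape for Rect {
    fn area(&self) -> f64 { self.w * self.h }
}

// Each call to `area` is dispatched through the vtable stored in the box.
fn total_area(shapes: &[Box<dyn Shape>]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}

// A Box<dyn Shape> is a fat pointer: data pointer plus vtable pointer.
fn box_dyn_is_two_pointers() -> bool {
    size_of::<Box<dyn Shape>>() == 2 * size_of::<usize>()
}

// Two different concrete types stored in one Vec.
fn sample_shapes() -> Vec<Box<dyn Shape>> {
    vec![
        Box::new(Square { side: 2.0 }),
        Box::new(Rect { w: 1.0, h: 3.0 }),
    ]
}
```

total_area(&sample_shapes()) returns 7.0 (4.0 from the square, 3.0 from the rectangle), even though total_area never learns which concrete types it was given.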

      What you're building

      A tiny text-transformation pipeline. The trait is one method:

      trait Command {
          fn run(&self, input: &str) -> String;
      }
      

      Three commands are already implemented for you: Uppercase, Reverse, and Append { suffix: String }.

      The exercise is the orchestrator: apply_pipeline threads an input string through every command in order, feeding each command's output into the next command's input, and returns the final result. An empty pipeline returns the input unchanged.

      The reason this works is Box<dyn Command>. The pipeline can mix Uppercase (a unit struct), Reverse (also a unit struct), and Append { suffix: String } (carries a field) in the same Vec, because each one is hidden behind the same fat pointer. A generic Vec<C> where C: Command would only let you pick one concrete command type per pipeline.

      Useful from the standard library

      • A for loop over &[Box<dyn Command>] yields &Box<dyn Command> on each iteration. Method calls auto-deref through the box (and through the &), so cmd.run(...) just works.
      • The pipeline is a fold: start with the input, and at each step the next command takes the previous output. A plain let mut current = input.to_string(); plus reassignment in the loop is the most readable thing.
      • str::chars().rev().collect::<String>() is one way to reverse a string (it's already written in the Reverse impl below). Note that this reverses by Unicode scalar value, not by grapheme; the tests stick to ASCII so it doesn't matter here.
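As a warm-up that stops short of the exercise, the sketch below iterates a mixed slice of boxed commands and contrasts it with a generic version. The three impl bodies are assumptions about what the provided commands do (the exercise's own versions may differ in detail), and run_each and run_each_generic are hypothetical helpers that run every command on the same input rather than chaining them.

```rust
trait Command {
    fn run(&self, input: &str) -> String;
}

// Assumed implementations; the exercise's own versions may differ in detail.
struct Uppercase;
struct Reverse;
struct Append { suffix: String }

impl Command for Uppercase {
    fn run(&self, input: &str) -> String { input.to_uppercase() }
}
impl Command for Reverse {
    fn run(&self, input: &str) -> String { input.chars().rev().collect::<String>() }
}
impl Command for Append {
    fn run(&self, input: &str) -> String { format!("{}{}", input, self.suffix) }
}

// Hypothetical helper: runs every command on the SAME input (no chaining).
fn run_each(commands: &[Box<dyn Command>], input: &str) -> Vec<String> {
    let mut outputs = Vec::new();
    for cmd in commands {
        // cmd is &Box<dyn Command>; auto-deref finds `run` on the trait object.
        outputs.push(cmd.run(input));
    }
    outputs
}

// Generic counterpart: every element must be the same concrete type C.
fn run_each_generic<C: Command>(commands: &[C], input: &str) -> Vec<String> {
    commands.iter().map(|cmd| cmd.run(input)).collect()
}
```

With these assumed impls, run_each over [Uppercase, Reverse, Append { suffix: "!" }] and the input "abc" yields ["ABC", "cba", "abc!"]. The generic version accepts &[Reverse, Reverse] but cannot accept a slice that mixes Uppercase with Reverse, which is the trade-off the section describes.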
      Exercise 3 of 3

        Hints
        1. The pipeline is a fold: keep a running current string, replace it with cmd.run(&current) on each iteration, return it at the end.
        2. An empty pipeline never enters the loop, so current ends up as the original input. That gets the empty-pipeline test for free.
        3. Method calls go through the box automatically. cmd.run(...) is the only thing you call inside the loop.
        4. let mut current = input.to_string();
          for cmd in commands {
              current = cmd.run(&current);
          }
          current
          
          Or, if you've peeked at Chapter 16:
          commands
              .iter()
              .fold(input.to_string(), |acc, cmd| cmd.run(&acc))
          

        Wrapping up smart pointers

        You added two boxed integers, defined a recursive expression-tree type that only compiles because of Box, and threaded an input string through a heterogeneous pipeline of text-transformation commands via Box<dyn Command>.

        What we learned

        • A smart pointer is a type that owns heap data and runs cleanup automatically on drop. It's RAII: the destructor releases the resource, so you never write free or delete.
        • Box<T> is the simplest smart pointer: one owner, one heap allocation, dropped when the box goes out of scope. C++ devs: this is std::unique_ptr<T>.
        • Box::new(value) constructs a box. *boxed dereferences it, and most method calls auto-deref so you rarely need to write * by hand.
        • Recursive enums need indirection. Add(Expr, Expr) is infinitely sized; Add(Box<Expr>, Box<Expr>) is two pointers. The compiler can lay it out, and recursion mirrors the data exactly. The same concept underpins parsers, interpreters, and ASTs everywhere.
        • Box<dyn Trait> is the owned form of a trait object. It lets you store mixed concrete types behind a single interface (a Vec<Box<dyn Command>> of pipeline stages, all different structs, driven through one trait) and underpins the Box<dyn Error> pattern you'll meet in Chapter 19.
        • Dynamic dispatch through a trait object costs one vtable lookup per call. That's usually fine. Reach for generics (fn f<T: Command>) when you want the compiler to monomorphize away the indirection.

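        Other smart pointers, briefly

        • Rc<T>: shared ownership on a single thread. A reference count tracks the owners, and the value is dropped when the count hits zero.
        • Arc<T>: the atomically reference-counted equivalent of Rc<T> for multi-threaded code.
        • RefCell<T>: interior mutability. The borrow rules are checked at runtime, and a violation panics instead of failing to compile.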

        Where this goes next

        Chapter 16 puts iterators front and center, and you'll see how a chain of .iter().fold(...) could have replaced the explicit loop in apply_pipeline. Chapter 19 brings Box<dyn Error> and the ? operator together, which is the day-to-day payoff for understanding Box<dyn Trait> here.