Chapter 21

Parsing Structured Text and Generics


You have a problem. You decide to use generics. Now you have a Problem<T> where T: Clone + Send + Sync + 'static.

This chapter parses .env-style configuration files. Two new things show up:

  1. Splitting a string at the first occurrence of a separator.
  2. A generic function that works for any type the caller wants to parse into.

Splitting once

split returns an iterator of all parts. For "key=value" you usually want to split once and keep the rest of the line intact (in case the value itself contains the separator):

let line = "DATABASE_URL=postgres://user:pass@host/db";
match line.split_once('=') {
    Some((key, value)) => println!("{key} -> {value}"),
    None => println!("no '='"),
}

split_once returns Option<(&str, &str)>. The two halves are slices of the original string; no allocation.
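
To see why splitting once matters, the sketch below compares the two methods on a value that itself contains the separator. The helper name split_kv and the sample line are made up for illustration.

```rust
/// Hypothetical helper: split a line at the first '=' only.
fn split_kv(line: &str) -> Option<(&str, &str)> {
    line.split_once('=')
}

fn main() {
    let line = "TOKEN=abc=def==";

    // `split` cuts at every '=', so the value is scattered across several parts.
    let parts: Vec<&str> = line.split('=').collect();
    println!("{parts:?}");

    // `split_once` cuts only at the first '=', so the value stays intact.
    println!("{:?}", split_kv(line));
}
```

With split you would have to stitch the value back together; with split_once the second half is already the complete value.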

Generic functions

Sometimes you want one function that works for many types. Here, "parse this string into whatever the caller asks for" is a perfect fit:

use std::collections::HashMap;

fn get<T>(env: &HashMap<String, String>, key: &str) -> Option<T>
where
    T: std::str::FromStr,
{
    env.get(key)?.parse().ok()
}

let port: Option<u16> = get(&env, "PORT");
let debug: Option<bool> = get(&env, "DEBUG");

<T> declares a type parameter. The where T: FromStr clause says "T must implement the FromStr trait", which is what makes .parse() work. .parse() returns Result<T, T::Err>; .ok() discards the error value and gives back Option<T>, which matches the function's return type and sits comfortably next to the ? earlier in the same expression.
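
If the one-liner feels dense, it may help to see it unrolled. The sketch below should behave the same way as get, with each step written out by hand; the name get_expanded and the sample values are made up for illustration.

```rust
use std::collections::HashMap;
use std::str::FromStr;

/// Same behavior as the one-line `get`, written out step by step.
/// The name `get_expanded` is made up for this sketch.
fn get_expanded<T: FromStr>(env: &HashMap<String, String>, key: &str) -> Option<T> {
    // What `env.get(key)?` does: return None early if the key is missing.
    let raw: &String = match env.get(key) {
        Some(value) => value,
        None => return None,
    };

    // `.parse()` yields Result<T, T::Err>.
    let parsed: Result<T, T::Err> = raw.parse();

    // What `.ok()` does: keep the value, drop the error.
    match parsed {
        Ok(value) => Some(value),
        Err(_) => None,
    }
}

fn main() {
    let mut env = HashMap::new();
    env.insert("PORT".to_string(), "5432".to_string());
    env.insert("DEBUG".to_string(), "yes".to_string());

    let port: Option<u16> = get_expanded(&env, "PORT");
    // "yes" is not a valid bool literal, so this comes back as None.
    let debug: Option<bool> = get_expanded(&env, "DEBUG");
    println!("{port:?} {debug:?}");
}
```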

Trim and skip

Real config files have empty lines, comments, and trailing whitespace. The usual handling chain is:

for line in content.lines() {
    let line = line.trim();
    if line.is_empty() || line.starts_with('#') {
        continue;
    }
    // ...parse the line
}

continue skips the rest of the current loop iteration and jumps to the next one. Its sibling, break, exits the loop entirely.

A note on raw strings: r#"..."#

The tests in this chapter use raw string literals to embed a multi-line .env snippet without escaping anything:

let content = r#"
HOST=localhost
PORT=5432
"#;

A raw string starts with r and zero or more #s, then a quote. It ends with the matching closing quote and #s. Inside, backslashes and quotes are literal: no escape sequences. Use more #s on each side if the content itself contains "#.
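
As a quick illustration of the hash-count rule, here are three literals with zero, one, and two #s. The sample values and the samples function are made up for this sketch.

```rust
/// Returns three sample raw strings; the values are made up for this sketch.
fn samples() -> [&'static str; 3] {
    [
        // Zero hashes: backslashes are literal, but a quote would end the string.
        r"C:\temp\new",
        // One hash: quotes are allowed inside.
        r#"NAME="my app""#,
        // Two hashes: the body may contain the sequence "# without ending the string.
        r##"COLOR="#ff0000""##,
    ]
}

fn main() {
    for s in samples() {
        println!("{s}");
    }
}
```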

Parsing a single line

Before tackling whole files, get one line right. The .env format is KEY=value, but real-world files have surrounding whitespace too. KEY = value should be accepted and produce ("KEY", "value").

str::split_once('=') is the right tool: it gives back Option<(&str, &str)> containing the parts before and after the first =. From there it's trim plus a couple of validity checks.

We also introduce a small ParseError enum that the rest of the chapter will reuse.

Useful from the standard library

  • str::split_once splits at the first match and returns Option<(&str, &str)>. The None case maps cleanly onto ParseError::InvalidFormat.
  • str::trim on each half drops the surrounding whitespace.
  • str::is_empty catches the =value and KEY= cases after trimming.
  • The Ok arm needs owned Strings, so finish with .to_string() on each half before wrapping in the tuple.
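
The pieces listed above could fit together roughly as follows. This is one possible shape, not the exercise's reference solution: only InvalidFormat is named in the text, so the EmptyKey and EmptyValue variants and the exact signature are assumptions.

```rust
/// The text names `InvalidFormat`; `EmptyKey` and `EmptyValue` are
/// hypothetical variants added for this sketch.
#[derive(Debug, PartialEq)]
enum ParseError {
    InvalidFormat,
    EmptyKey,
    EmptyValue,
}

fn parse_env_line(line: &str) -> Result<(String, String), ParseError> {
    // No '=' at all: not a KEY=value line.
    let (key, value) = line.split_once('=').ok_or(ParseError::InvalidFormat)?;

    // Accept "KEY = value" by trimming each half.
    let (key, value) = (key.trim(), value.trim());

    if key.is_empty() {
        return Err(ParseError::EmptyKey);
    }
    if value.is_empty() {
        return Err(ParseError::EmptyValue);
    }

    // The halves borrow from `line`; the caller gets owned Strings.
    Ok((key.to_string(), value.to_string()))
}

fn main() {
    println!("{:?}", parse_env_line("KEY = value"));
    println!("{:?}", parse_env_line("=value"));
    println!("{:?}", parse_env_line("no separator"));
}
```

Option::ok_or turns the None from split_once into an error, which lets ? handle the early return.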

Parsing a whole file

With single-line parsing solved, the file-level function is mostly plumbing: iterate over content.lines(), skip blank lines and # comments, and accumulate the rest into a HashMap. Stop at the first malformed line and return an error. Strict parsing makes configuration bugs obvious instead of silently dropping values.

Each step is self-contained, so the previous step's parse_env_line and ParseError are re-declared here with todo!() bodies. Re-implement them (or paste your earlier solution) so this step compiles on its own.

Useful from the standard library

  • str::lines iterates over the lines of the file content, stripping \n and \r\n for you.
  • str::trim on each line lets you handle leading/trailing whitespace once.
  • str::starts_with with a '#' argument is the comment check. Combine with str::is_empty to skip blank lines.
  • HashMap::insert fills in each parsed pair. The ? after parse_env_line(line) short-circuits on the first malformed line.
  • A for loop reads more naturally here than an iterator chain because the body has both a continue skip and a ? early return.
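
A minimal sketch of that plumbing might look like the following. The function name parse_env and its signature are assumptions, and parse_env_line is a simplified stand-in that reports every problem as InvalidFormat.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
enum ParseError {
    InvalidFormat, // the only variant this sketch uses
}

// Simplified stand-in for the previous step's function.
fn parse_env_line(line: &str) -> Result<(String, String), ParseError> {
    let (key, value) = line.split_once('=').ok_or(ParseError::InvalidFormat)?;
    let (key, value) = (key.trim(), value.trim());
    if key.is_empty() || value.is_empty() {
        return Err(ParseError::InvalidFormat);
    }
    Ok((key.to_string(), value.to_string()))
}

/// Hypothetical name and signature for the file-level function.
fn parse_env(content: &str) -> Result<HashMap<String, String>, ParseError> {
    let mut env = HashMap::new();
    for line in content.lines() {
        let line = line.trim();
        if line.is_empty() || line.starts_with('#') {
            continue; // blank line or comment
        }
        let (key, value) = parse_env_line(line)?; // stop at the first bad line
        env.insert(key, value);
    }
    Ok(env)
}

fn main() {
    let content = r#"
# database settings
HOST=localhost
PORT = 5432
"#;
    println!("{:?}", parse_env(content));
    println!("{:?}", parse_env("HOST=localhost\noops"));
}
```

Because the loop body mixes continue with ?, rewriting it as an iterator chain would need a filter plus a collect into Result, which is arguably harder to read at this size.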

Typed lookup with generics

Configuration values are stored as strings, but consumers want u16 ports, bool flags, and so on. Rather than write one helper per type, declare a generic function bounded by FromStr and let the caller pick the type at the call site with a turbofish or a type annotation.

Inside the body, env.get(key)? short-circuits on a missing key and .parse().ok() collapses the parse Result into an Option. Don't try to ? the parse directly: the function returns Option<T>, and ? on a Result cannot propagate into an Option, so the error has to be dropped with .ok() first.

Useful from the standard library

  • HashMap::get returns Option<&String>. The ? propagates the missing-key case as None.
  • str::parse uses FromStr to produce Result<T, T::Err>. That's the trait the where clause is asking for.
  • Result::ok drops the error and yields Option<T>, exactly the function's return type.
  • The body fits on one line: env.get(key)?.parse().ok().
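
The two ways of pinning T at the call site can be sketched like this. The keys and values are sample data made up for illustration.

```rust
use std::collections::HashMap;
use std::str::FromStr;

fn get<T>(env: &HashMap<String, String>, key: &str) -> Option<T>
where
    T: FromStr,
{
    env.get(key)?.parse().ok()
}

fn main() {
    // Sample data made up for this sketch.
    let mut env = HashMap::new();
    env.insert("PORT".to_string(), "8080".to_string());
    env.insert("DEBUG".to_string(), "true".to_string());
    env.insert("NAME".to_string(), "api".to_string());

    // A type annotation on the binding pins T.
    let port: Option<u16> = get(&env, "PORT");

    // A turbofish pins T at the call itself.
    let debug = get::<bool>(&env, "DEBUG");

    // A missing key and an unparseable value both come back as None.
    let missing = get::<u16>(&env, "TIMEOUT");
    let not_a_number = get::<u16>(&env, "NAME");

    println!("{port:?} {debug:?} {missing:?} {not_a_number:?}");
}
```

Note the trade-off in the last two calls: returning Option<T> keeps the signature simple, but the caller cannot tell "not set" apart from "set to something invalid".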

Validating required variables

Most apps need some configuration to be present at startup: a database URL, a port, an API key. This last helper takes a list of required keys and reports the first one that's missing.

Iterator::find is a good fit: scan the slice, return the first key that isn't in the map, and turn that into an Err. If find returns None, every required key was present and the result is Ok(()).

Useful from the standard library

  • <[T]>::iter on required yields &&str (a reference to each &str in the slice). Iterator::find hands its closure a reference to each item, so the closure parameter |key| is &&&str. It auto-derefs when used as a method receiver, but as an argument it needs explicit derefs, for example contains_key(**key).
  • Iterator::find takes a predicate and returns the first matching item as Option<&&str>. Combine with HashMap::contains_key to look for keys not in the map.
  • Option::map_or collapses the result into a Result in one call: find(...).map_or(Ok(()), |k| Err(k.to_string())).
  • A plain for k in required { if !env.contains_key(*k) { return Err(...) } } loop reads just as well; pick whichever you like.
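
Both styles can be sketched side by side. The function names, the signature, and the choice of String as the error type (carrying the missing key) are assumptions made for this example.

```rust
use std::collections::HashMap;

/// Hypothetical name and signature; the error carries the first missing key.
fn require_all(env: &HashMap<String, String>, required: &[&str]) -> Result<(), String> {
    required
        .iter()
        // `key` is &&&str here; `**key` is the &str that contains_key expects.
        .find(|key| !env.contains_key(**key))
        .map_or(Ok(()), |key| Err(key.to_string()))
}

/// The same check as a plain loop.
fn require_all_loop(env: &HashMap<String, String>, required: &[&str]) -> Result<(), String> {
    for key in required {
        // `key` is &&str here, so one deref is enough.
        if !env.contains_key(*key) {
            return Err(key.to_string());
        }
    }
    Ok(())
}

fn main() {
    let mut env = HashMap::new();
    env.insert("PORT".to_string(), "5432".to_string());

    println!("{:?}", require_all(&env, &["PORT"]));
    println!("{:?}", require_all(&env, &["PORT", "API_KEY"]));
    println!("{:?}", require_all_loop(&env, &["PORT", "API_KEY"]));
}
```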

Wrapping up the env-file parser

You parsed structured text line by line, layered file-level handling on top, exposed configuration through a generic typed lookup, and wrote a quick presence check for required keys.

What we learned

  • split_once(delim) is the right tool for "key/value, split at the first separator". Returns Option<(&str, &str)> with no allocation.
  • lines() + trim() + starts_with('#') + is_empty() is the standard recipe for walking a config file. A for loop with continue and ? reads better than an iterator chain when the body has both kinds of control flow.
  • A small custom error enum (ParseError) lives nicely next to the parser. ? propagates it without ceremony as long as the function returns Result<_, ParseError>.
  • Generics let one function serve many caller types. The where T: FromStr bound is what makes .parse() work, and the caller pins T with a type annotation or a turbofish.
  • result.ok() is the easy way to drop an error and produce Option<T> when you genuinely don't care which kind of parse failure happened.
  • Raw string literals (r#"..."#) embed multi-line text without escaping. The number of #s on each side just has to be enough to avoid colliding with the body.
  • For real apps, the dotenvy crate reads .env files into the process environment; the parser you just wrote is a stripped-down version of the same idea.