Congratulations, you've covered enough of Rust to write a small, useful program without any extra ceremony! Time for a short break to enjoy the view.
The next three steps build a tiny word-count library. The whole chapter is just
strings (chapter 2), for loops (chapter 3), and functions (chapter 4) applied
together.
This first version is the running example we'll keep refactoring throughout the course.
The standard library hands you split_whitespace on every &str.
For now, treat it as a "black box" that lets a for loop walk through
each word in a string:
for word in "hello world\nrust".split_whitespace() {
    println!("{word}"); // hello, world, rust
}
It splits on any run of whitespace (spaces, tabs, newlines) and
skips empties, which is what you want for natural text. Don't
worry yet about what kind of thing .split_whitespace() returns
(it's an iterator, which we will cover later).
The same idea works at the character level via .chars():
for c in "hi".chars() {
    println!("{c}"); // h, i
}
Both .split_whitespace() and .chars() are exactly the tools we need to count words and characters in a string.
Your first function: given a string of text, return how many words
it contains. Words are anything separated by whitespace, so
"hello world" has two words and " " has zero.
The recipe is the simplest possible: keep a counter, walk the text
with for ... in text.split_whitespace(), bump the counter on each
iteration, return it at the end.
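One way the recipe could look, as a minimal sketch rather than the only acceptable answer. The signature matches the word_count shown at the end of this chapter; the variable names count and _word are illustrative choices.

```rust
// Counts whitespace-separated words by hand: counter, loop, return.
fn word_count(text: &str) -> usize {
    let mut count = 0;
    // The leading underscore tells the compiler we never read the word itself;
    // we only care how many pieces the loop visits.
    for _word in text.split_whitespace() {
        count += 1;
    }
    count
}
```

With this version, word_count("hello world") returns 2 and word_count(" ") returns 0, because a string containing only whitespace yields no pieces at all.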
Useful from the standard library
str::split_whitespace walks through every whitespace-separated piece of a string. It handles tabs, newlines, and runs of consecutive spaces without any extra work on your part.
Same template as word_count, smaller granularity. Count every
character in the text, whitespace included. So "hi there" returns
8 (seven letters plus the space).
The recipe is identical to the previous step: start a counter at
0, walk text.chars() with a for loop, bump the counter once
per iteration.
The interesting wrinkle here is what chars() actually returns.
Rust strings are UTF-8 internally, so a single visible character
like é can take more than one byte. text.chars() walks the
characters, while text.len() returns the byte length — and
the two differ as soon as you leave plain ASCII. The unicode test
below pins this down.
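A minimal sketch of the loop described above, assuming the same signature as the char_count shown at the end of this chapter. The names count and _c are illustrative.

```rust
// Counts characters by hand, whitespace included.
fn char_count(text: &str) -> usize {
    let mut count = 0;
    // chars() yields one item per character, however many bytes it occupies.
    for _c in text.chars() {
        count += 1;
    }
    count
}
```

To see the byte/character gap, try the string "caf\u{e9}" (the word "café", with é written as an escape so its encoding is unambiguous). char_count returns 4 for it, while .len() returns 5, because é takes two bytes in UTF-8.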
Useful from the standard library
str::chars walks through every char in a string, whitespace and all.
str::len returns the byte length. Reach for chars().count() when you actually mean "how many characters?"; the two answers diverge the moment a non-ASCII character shows up.
Last one. Return the length (in characters) of the longest word in
the text. If the text has no words at all, return 0.
This brings together everything from the previous two steps: walk
the words with for word in text.split_whitespace(), measure each
one with word.chars().count(), and track the running maximum in a
let mut max = 0 variable.
The "track the running maximum" pattern shows up everywhere:
let mut max = 0;
for x in candidates {
    if x > max {
        max = x;
    }
}
This is the manual version of "max by some property". Chapter 16 ("Iterators") shows it as a one-liner; doing it once by hand makes the shortcut feel like a reward rather than magic.
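Applied to this step, the pattern might be sketched as follows. The signature matches the longest_word shown at the end of this chapter; the local name len is an illustrative choice.

```rust
// Returns the length, in characters, of the longest word (0 if there are none).
fn longest_word(text: &str) -> usize {
    let mut max = 0;
    for word in text.split_whitespace() {
        // Measure in characters, not bytes, so accented words count correctly.
        let len = word.chars().count();
        if len > max {
            max = len;
        }
    }
    max
}
```

Starting max at 0 handles the "no words" case for free: if the loop body never runs, the function returns 0 without a special check.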
Useful from the standard library
str::chars and Iterator::count together give you a correct character count: word.chars().count(). Using word.len() would return the byte length, which differs from the character count for accented or non-Latin text.
Three tiny functions, all cut from the same template: a counter
variable, a for loop, and a return statement. That's enough to
build a real, useful tool — and it's the same shape you'll keep
reaching for as the chapters get bigger.
What we learned
- text.split_whitespace() walks the words in a string for you. It handles any kind of whitespace and skips empties without ceremony.
- text.chars() walks every character in a string, whitespace and all. It's the right tool for "how many characters?".
- "Track the running maximum" is the same shape every time: let mut max = 0; for x in xs { if x > max { max = x; } }.
- word.chars().count() measures string length in characters, which is usually what you want. str::len returns bytes, and the two differ the moment you hit a non-ASCII character.
You'll meet split_whitespace and chars again in chapter 16
("Iterators"), where the three loops you just wrote collapse to:
fn word_count(text: &str) -> usize { text.split_whitespace().count() }
fn char_count(text: &str) -> usize { text.chars().count() }
fn longest_word(text: &str) -> usize {
    text.split_whitespace().map(|w| w.chars().count()).max().unwrap_or(0)
}
Then in chapter 17 ("Word Frequencies") we'll go further and ask not just how many words a text contains, but which words appear and how often each one shows up.