One of the cool things about being involved with the development of Rust over the last couple of years has been watching the community grow along with the language. When I came to Mozilla as an intern in March 2011 and started contributing to Rust, the Rust “community” consisted of perhaps twenty people who frequented the #rust IRC channel, most of whom were Mozilla employees. There weren’t any third-party resources for helping people get started with Rust back then.

Two years later, there are 170 users in #rust on a typical Saturday afternoon like this one, and we’re starting to have a healthy ecosystem of volunteer-created Rust Stuff, such as this book by Steve Klabnik.1 In his book, after getting some basics out of the way, Steve shows how to code up a solution to the FizzBuzz problem in Rust. FizzBuzz is a simple programming problem that is nevertheless infamous for stumping people in job interviews: Write a program that prints the numbers from 1 to 100. But for multiples of three, print “Fizz” instead of the number, and for multiples of five, print “Buzz”. For numbers that are multiples of both three and five, print “FizzBuzz”. People like to use it as a toy problem for getting started with a new programming language, because it covers iteration, conditionals, and printing output.

But FizzBuzz also struck me as the perfect way to introduce some of Rust’s pattern-matching features. Pattern matching is everywhere in Rust code, and Steve does get to it later on in the book, but I couldn’t help myself – this problem was just crying out for it. So, here’s an introduction to pattern matching in Rust, by way of FizzBuzz.

The original code

Here’s the version of FizzBuzz from Steve’s book, minus the tests:

fn is_three(num: int) -> bool {
  num % 3 == 0
}

fn is_five(num: int) -> bool {
  num % 5 == 0
}

fn is_fifteen(num: int) -> bool {
  num % 15 == 0
}

fn main() {
  for num in range(1, 101) {
    println(
      if is_fifteen(num) { ~"FizzBuzz" }
      else if is_three(num) { ~"Fizz" }
      else if is_five(num) { ~"Buzz" }
      else { num.to_str() }
    );
  }
}

This code works fine, but let’s see if we can rewrite the main function to use pattern matching.

First version: transliterate into match

First, let’s try writing something that looks a lot like the original, but uses Rust’s match construct.

fn main() {
    for num in range(1, 101) {
        println(
            match num {
                _ if is_fifteen(num) => ~"FizzBuzz",
                _ if is_three(num) => ~"Fizz",
                _ if is_five(num) => ~"Buzz",
                _ => num.to_str()
            }
        );
    }
}

In Rust, the match language construct takes an expression, in this case num, and a number of arms, each beginning with a pattern. In this example, the first arm (on line 5) begins with _, which is the “wildcard” pattern – that is, it matches everything. But then, the match is narrowed down by is_fifteen(num), the value of which determines whether that arm is actually taken. The if is_fifteen(num) is what’s known as a pattern guard. If is_fifteen(num) evaluates to true, we take that arm.

Looking at the next match arm on line 6, we see that its pattern is also _, the wildcard pattern. In fact, all of the patterns in this example are _, and all of the work of matching is being done by pattern guards.

Second version: match against tuples of bools

The code from our first attempt actually works fine, but it’s a rather weird (and error-prone) way to use pattern matching. Pushing all of the work into the pattern guards makes the code a bit confusing, and it doesn’t really let us take advantage of the convenience and power of pattern matching. So, let’s try something else:

fn main() {
    for num in range(1, 101) {
        println(
            match (is_three(num), is_five(num)) {
                (true, true) => ~"FizzBuzz",
                (true, false) => ~"Fizz",
                (false, true) => ~"Buzz",
                _ => num.to_str()
            }
        );
    }
}

Here, we’re matching against a tuple of two values: the value of is_three(num) and the value of is_five(num). If both of them are true, then we print FizzBuzz. Notice that we don’t need the is_fifteen function anymore, because the tuple (true, true) covers exactly that case! Pattern matching has already helped us.

This code works fine as written, but there’s another tweak we can make. Since the only possible match for the wildcard pattern in the final arm is (false, false) (since every other case has already been handled), we might as well change it to that:

fn main() {
    for num in range(1, 101) {
        println(
            match (is_three(num), is_five(num)) {
                (true, true) => ~"FizzBuzz",
                (true, false) => ~"Fizz",
                (false, true) => ~"Buzz",
                (false, false) => num.to_str()
            }
        );
    }
}

Doing so leads to an unexpected win: now we can reorder the match arms any way we want. In the version that used if, we would’ve been in trouble if the “FizzBuzz” case hadn’t been first or if the final, catch-all case hadn’t been last, but now it’s fine to order them any which way. We can swap the “FizzBuzz” case and the catch-all cases around, for instance:

fn main() {
    for num in range(1, 101) {
        println(
            match (is_three(num), is_five(num)) {
                (false, false) => num.to_str(),
                (true, false) => ~"Fizz",
                (false, true) => ~"Buzz",
                (true, true) => ~"FizzBuzz"
            }
        );
    }
}

Moreover, if we forget any of the arms, the compiler tells us that we’ve made a mistake. Commenting out the “Fizz” case, for instance, results in an error, telling us that the match patterns didn’t cover all possible cases.2 This is called exhaustiveness checking.

$ rustc fizzbuzz.rs 
fizzbuzz.rs:18:12: 23:13 error: non-exhaustive patterns: false not covered
fizzbuzz.rs:18             match (is_three(num), is_five(num)) {
fizzbuzz.rs:19                 (false, false) => int::str(num),
fizzbuzz.rs:20                 // (true, false) => ~"Fizz",
fizzbuzz.rs:21                 (false, true) => ~"Buzz",
fizzbuzz.rs:22                 (true, true) => ~"FizzBuzz"
fizzbuzz.rs:23             }
error: aborting due to previous error

Exhaustiveness checking is conservative, meaning that if the compiler can’t determine whether a match is exhaustive or not, it will complain of non-exhaustiveness even if the match really is exhaustive. To illustrate, let’s try writing this code slightly differently.

Third version: match against tuples of ints

Instead of using the is_three and is_five functions, let’s try matching directly against the values of num % 3 and num % 5.

fn main() {
    for num in range(1, 101) {
        println(
            match (num % 3, num % 5) {
                (0, 0) => ~"FizzBuzz",
                (0, n) if n != 0 => ~"Fizz",
                (m, 0) if m != 0 => ~"Buzz",
                (m, n) if m != 0 && n != 0 => num.to_str()
            }
        );
    }
}

Now we’re matching against a tuple of ints instead of a tuple of bools, and we’re using pattern guards to make the arms of the match non-overlapping and therefore order-independent. Unfortunately, this code won’t compile. As it turns out, the match is exhaustive, but the compiler can’t determine that it is. We get the following error:

$ rustc fizzbuzz.rs 
fizzbuzz.rs:18:12: 23:13 error: non-exhaustive patterns
fizzbuzz.rs:18             match (num % 3, num % 5) {
fizzbuzz.rs:19                 (0, 0) => ~"FizzBuzz",
fizzbuzz.rs:20                 (0, n) if n != 0 => ~"Fizz",
fizzbuzz.rs:21                 (m, 0) if m != 0 => ~"Buzz",
fizzbuzz.rs:22                 (m, n) if m != 0 && n != 0 => int::str(num)
fizzbuzz.rs:23             }
error: aborting due to previous error

In order to confirm that the match is exhaustive, the compiler would have to be able to prove that, for instance, the cases excluded by the if m != 0 && n != 0 pattern guard on the last arm are covered by the other arms of the match. It can’t do that, so it assumes that the match is non-exhaustive and raises an error.

But all isn’t lost: we can just get rid of the pattern guards. The result is a match that the compiler knows is exhaustive. The only drawback is that now we can no longer reorder the match arms at will. (Also, we would get an unused variable warning if we continued to use m and n after getting rid of the pattern guards that use them, so we’ll just change them to underscores.)

fn main() {
    for num in range(1, 101) {
        println(
            match (num % 3, num % 5) {
                (0, 0) => ~"FizzBuzz",      // must come first
                (0, _) => ~"Fizz",
                (_, 0) => ~"Buzz",
                (_, _) => num.to_str()      // must come last
            }
        );
    }
}

Code like this turns up all the time in Rust programs. It’s concise, convenient and easy to write. But there’s still that nagging issue of certain clauses having to come first and last, because the match patterns overlap with each other to some extent.

Is there a way to get back the order-independence of the version that used Booleans, but without relying on Booleans?

Why not just use Booleans?

First of all, why might we not want to use Booleans? The answer is that sometimes we want more information than a Boolean value is capable of carrying. Consider again our version of FizzBuzz that matches against a tuple of bools:

fn main() {
    for num in range(1, 101) {
        println(
            match (is_three(num), is_five(num)) {
                (false, false) => num.to_str(),
                (true, false) => ~"Fizz",
                (false, true) => ~"Buzz",
                (true, true) => ~"FizzBuzz"
            }
        );
    }
}

It works just fine for our purposes so far, of course. But suppose that, in addition to the usual FizzBuzz specification, we also wanted to print “Zot” every time we encountered a number that had a remainder of 2 when divided by 3, and a remainder of 1 when divided by 5. (Such numbers include 11, 26, 41… .)

Admittedly, this is a contrived problem. But if we did want to handle it, we’d have to overhaul our code, because the results of is_three and is_five don’t give us enough information. They tell us whether their arguments have a remainder when divided by three and five, respectively, but they don’t tell us what that remainder is.3

In that sense, the version that matches against tuples of ints instead of bools is much more flexible. We can handle the “Zot” case with a one-line addition to it:

fn main() {
    for num in range(1, 101) {
        println(
            match (num % 3, num % 5) {
                (2, 1) => ~"Zot",
                (0, 0) => ~"FizzBuzz",  // must come first (or second, if "Zot" is first)
                (0, _) => ~"Fizz",
                (_, 0) => ~"Buzz",
                (_, _) => num.to_str()  // must come last
            }
        );
    }
}

The “Zot” case can be inserted almost anywhere – including before the “FizzBuzz” case, which previously had to be first. The catch-all (_, _) pattern still has to come last, though.

Having it both ways

Is there a way to get the order-independence that we get when we match against bools, but also the flexibility we get when we match against ints? Yes, there is!

Since neither int nor bool are quite what we want, we’ll define our own type, which we’ll call Rem for remainder. Rem is the type of values that are the result of the num % 3 and num % 5 operations. We know that such a value will be one of 0, 1, 2, 3, or 4 (or one of just 0, 1, or 2 in the case of num % 3). For regular FizzBuzz, all we care about is whether the value is 0 or not; for the “Zot” situation, on the other hand, we need more information. So, we define the Rem type accordingly:

enum Rem {
    zero,
    other(NonZeroRem)
}

enum NonZeroRem {
    one, two, three, four
}

The Rem type distinguishes two variants of values: those that are zero, and those that are something else. In the “something else” case, we also carry around the information about what the non-zero value is, just in case it becomes important.

Informally, these types let us treat a value like a bool if we squint at it one way and use the zero/other distinction, but we can also treat it like an int if we feel like doing so (albeit one in a finite range, and a rather narrow one at that, but it’s all we need for this problem). We also could have defined separate types for values that are the result of num % 3 and values that are the result of num % 5, if we had wanted to be especially pedantic.

Given an int in the range 0-4, it’s easy to determine the corresponding value of type Rem. The int_to_rem function takes an int and returns a Rem representation of it.

fn int_to_Rem(num: int) -> Rem {
    match num {
        0 => zero,
        1 => other(one),
        2 => other(two),
        3 => other(three),
        4 => other(four),
        _ => fail!("oops")
    }
}

The last case raises an error because if we ever pass something to int_to_Rem that’s not in the range 0-4, we’ve done something wrong – there’s no Rem representation for such a value.

Final version: match against tuples of Rems

Now, we can write our code this way:

fn main() {
    for num in range(1, 101) {
        println(
            match (int_to_Rem(num % 3), int_to_Rem(num % 5)) {
                (zero, zero) => ~"FizzBuzz",
                (zero, other(_)) => ~"Fizz",
                (other(_), zero) => ~"Buzz",
                (other(_), other(_)) => num.to_str()
            }
        );
    }
}

Now we’re matching against a tuple of Rems, but we’ve retained the order-independence that we had when we were matching against a tuple of bools. Because the cases are all non-overlapping, we can put them in any order we want – say, like this:

fn main() {
    for num in range(1, 101) {
        println(
            match (int_to_Rem(num % 3), int_to_Rem(num % 5)) {
                (other(_), other(_)) => num.to_str(),
                (zero, other(_)) => ~"Fizz",
                (other(_), zero) => ~"Buzz",
                (zero, zero) => ~"FizzBuzz"
            }
        );
    }
}

And because we’re working with Rems instead of bools, we can easily modify the code to handle the “Zot” case:

fn main() {
    for num in range(1, 101) {
        println(
            match (int_to_Rem(num % 3), int_to_Rem(num % 5)) {
                (other(two), other(one)) => ~"Zot",
                (other(_), other(_)) => num.to_str(),
                (zero, other(_)) => ~"Fizz",
                (other(_), zero) => ~"Buzz",
                (zero, zero) => ~"FizzBuzz"
            }
        );
    }
}

To be fair, we had to introduce a little bit of order-dependence here, because the pattern for the “Zot” case overlaps with the (other(_), other(_)) pattern, so the former has to come before the latter. But there are no ordering constraints other than that. Specifically, there’s no particular arm that has to come last, as we had when we were working with ints.

Finally, if we really wanted order-independence at all costs, we could always get it by spelling out cases for each of the NonZeroRem variants. This is possible because there are a finite number of values of NonZeroRem type, whereas it wouldn’t have been possible if we were still working with ints. It’s when working with enum types like Rem and NonZeroRem that Rust’s pattern-matching features are particularly handy, and this post really only scratches the surface of what you can do.

So, go forth and pattern-match, and let me know how it goes!

This post is adapted from a gist I wrote back in December. The code has been tested with Rust 0.5, which is the most recent release as of this writing.

Update (March 3, 2013): I just added two of the versions of FizzBuzz from this post to the Rust section of the Rosetta Code FizzBuzz page. (There were a couple of Rust FizzBuzzes there already, but they didn’t use pattern matching.)

Update (October 14, 2013): I’ve updated this post, and the accompanying gist, to work against Rust 0.8. The syntax has gotten a lot nicer since 0.5!

  1. The book is called Rust for Rubyists, but you don’t have to know much of anything about Ruby to enjoy it; I don’t, and I did. 

  2. Actually, the error message should say (true, false) not covered rather than false not covered. I filed a bug for this issue a couple of months ago, and it’s still open. Any takers? 

  3. I think this lack of information is an example of what Bob Harper says Dan Licata refers to as “Boolean blindness”

Comments