This is an implementation of rainfall in Rust. The design factors input, computation, and output into separate functions. Instead of reading from stdin and writing to stdout, the IO functions are parameterized to support unit testing:
- Function read_measurements can read from any type that implements trait Read (e.g., Stdin, File, or &[u8]).
- Function write_output can write to any type that implements trait Write (e.g., Stdout, File, or &mut Vec<u8>).
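To illustrate the parameterization, here is a minimal sketch. The helpers count_lines and write_count are hypothetical and not part of this crate; they only show how a function that is generic over Read or Write accepts an in-memory buffer as readily as a file or a standard stream.

```rust
use std::io::{self, BufRead, BufReader, Read, Write};

// Hypothetical helper (not from the crate): counts the lines of any reader.
fn count_lines<R: Read>(input: R) -> io::Result<usize> {
    let mut count = 0;
    for line in BufReader::new(input).lines() {
        line?; // propagate IO errors to the caller instead of panicking
        count += 1;
    }
    Ok(count)
}

// Hypothetical helper (not from the crate): writes a report to any writer.
fn write_count<W: Write>(mut output: W, count: usize) -> io::Result<()> {
    writeln!(output, "lines: {}", count)
}
```

A test can call count_lines("a\nb\n".as_bytes()) and write_count(&mut buffer, 2) with a Vec<u8> buffer, while a real program could pass std::io::stdin() and std::io::stdout() to the same functions.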
Additionally, instead of panicking on errors, the IO functions report errors by returning std::io::Results.
You may find read_measurements difficult to read, as it’s written in a functional style using Iterator transformers. So first let’s consider two simpler versions of the function.
Function read_measurements0 also uses iterator transformers, but because it punts on error handling, it may be easier to understand. In particular, it:
1. creates an iterator over the lines of the input,
2. checks for errors and panics if it encounters one,
3. truncates the stream if it sees the termination code "999" (where |…| … is Rust syntax for a lambda, with the parameters between the pipes and the body after),
4. attempts to parse each line into an f64, filtering parsing failures out of the stream,
5. filters out negative readings, and finally
6. collects the remaining readings into a Vec<f64>.
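The six steps listed above might look roughly like the following. This is a sketch reconstructed from the description, not necessarily the crate’s exact code; in particular, the parameter name, the panic message, and the calls to trim are assumptions.

```rust
use std::io::{BufRead, BufReader, Read};

// Sketch only: one way to realize the six steps described in the text.
fn read_measurements0<R: Read>(reader: R) -> Vec<f64> {
    BufReader::new(reader)
        .lines() // 1. iterate over the lines of the input
        .map(|r| r.expect("read error")) // 2. panic on any IO error
        .take_while(|line| line.trim() != "999") // 3. stop at the terminator
        .filter_map(|line| line.trim().parse::<f64>().ok()) // 4. drop unparsable lines
        .filter(|&x| x >= 0.0) // 5. drop negative readings
        .collect() // 6. gather what remains into a Vec<f64>
}
```

Each adapter consumes the iterator produced by the previous one, so nothing is read until collect starts pulling items through the chain.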
Step 6 may seem kind of magical, because the Iterator::collect method can accumulate the values of an iterator into a variety of different collection types. For example, the item impl FromIterator<char> for String means that an iterator over characters can be collected into a string, whereas the item impl FromIterator<String> for String means that an iterator over strings can also be collected into a string. The impl used by step 6 is impl<T> FromIterator<T> for Vec<T>.
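As a small illustration of how the return type selects the FromIterator impl, consider the sketch below. The three helper functions are hypothetical names introduced only for this example.

```rust
// Hypothetical helpers: the same collect call picks a different impl each time,
// driven by the declared return type.

fn chars_to_string(chars: &[char]) -> String {
    chars.iter().copied().collect() // impl FromIterator<char> for String
}

fn strings_to_string(parts: Vec<String>) -> String {
    parts.into_iter().collect() // impl FromIterator<String> for String
}

fn doubled(xs: &[f64]) -> Vec<f64> {
    xs.iter().map(|x| x * 2.0).collect() // impl<T> FromIterator<T> for Vec<T>
}
```

In each case the compiler infers the target collection from the function’s signature, which is why the bare call to collect needs no type annotation.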
Next, function read_measurements1 propagates errors to its caller instead of panicking. However, rather than using the functional/iterator style, it’s written in an imperative style using a mutable vector variable, a for loop, a break statement, and several ifs. This is close to how you’d write it in C++. Note that let line = line?; checks whether line (a Result) is an error or okay. If it’s an error then the function returns immediately, propagating the error; but if line is okay then ? extracts the String from it and binds line (a different variable that happens to have the same name) to that.
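An imperative version along the lines just described could be sketched as follows. The variable names and the use of trim are assumptions; only the overall shape (mutable vector, for loop, break, ifs, and let line = line?;) comes from the description.

```rust
use std::io::{self, BufRead, BufReader, Read};

// Sketch only: imperative reading loop that propagates IO errors.
fn read_measurements1<R: Read>(reader: R) -> io::Result<Vec<f64>> {
    let mut measurements = Vec::new();
    for line in BufReader::new(reader).lines() {
        // Returns early with the error on Err; otherwise rebinds `line`
        // to the String inside the Ok.
        let line = line?;
        if line.trim() == "999" {
            break; // termination code: stop reading
        }
        if let Ok(value) = line.trim().parse::<f64>() {
            if value >= 0.0 {
                measurements.push(value); // keep only non-negative readings
            }
        }
    }
    Ok(measurements)
}
```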
The imperative implementation read_measurements1 is correct, and you don’t need to be able to write fancy iterator transformer chains to write excellent Rust. You should, though, at least be able to read both ways of expressing this kind of algorithm. So let’s return to read_measurements and read through it step by step. It:
1. creates an iterator over the lines of the input,
2. truncates the stream if it sees the termination code "999",
3. attempts to parse each line into an f64, filtering it out of the stream when parsing fails,
4. filters out negative readings, and finally
5. collects the remaining readings into an std::io::Result<Vec<f64>>.
This time, step 5 is particularly interesting. As in the other implementations, the stream of lines returned by BufRead::lines is an iterator not over Strings but over std::io::Result<String>s; but unlike in read_measurements0, we don’t bail out on errors. Instead, steps 2–4 all have to deal with the possibility of errors, which is why steps 2 and 3 use Result::map to work on Ok results while passing Err results through unchanged, and why step 4 uses Result::unwrap_or to map errors to a number that the filter predicate accepts.
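One way the five steps could fit together is sketched below. It follows the description (Result::map in steps 2 and 3, Result::unwrap_or in step 4), but the exact closures, the use of transpose, and the calls to trim are assumptions rather than the crate’s actual code.

```rust
use std::io::{self, BufRead, BufReader, Read};

// Sketch only: every adapter must let Err items flow through to collect.
fn read_measurements<R: Read>(reader: R) -> io::Result<Vec<f64>> {
    BufReader::new(reader)
        .lines() // 1. iterator over io::Result<String>
        // 2. stop at "999"; an Err is not the terminator, so keep going
        .take_while(|r| r.as_ref().map(|line| line.trim() != "999").unwrap_or(true))
        // 3. parse Ok lines, dropping unparsable ones; Err passes through
        .filter_map(|r| r.map(|line| line.trim().parse::<f64>().ok()).transpose())
        // 4. an Err is treated as 0.0 so that the predicate accepts it
        .filter(|r| *r.as_ref().unwrap_or(&0.0) >= 0.0)
        // 5. Ok(vec) if every item is Ok, otherwise the first Err
        .collect()
}
```

Invalid UTF-8 is a convenient way to trigger an error in a test, because BufRead::lines reports it as an std::io::Error.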
Thus, coming out of step 4 and into step 5 is a stream of std::io::Result<f64>s, and Iterator::collect must turn an iterator over std::io::Result<f64>s into an std::io::Result<Vec<f64>>.
What does this mean? If every std::io::Result<f64> in the stream is Ok then it returns Ok of a vector containing all the f64s, but if it ever encounters Err of some std::io::Error e then it stops and returns Err(e) immediately. Here is the impl logic:
impl<T, E, C> FromIterator<Result<T, E>> for Result<C, E>
where
C: FromIterator<T>
That is:
- For any types T (the element), E (the error), and C (the container),
- if an iterator over Ts can be collected into a C,
- then an iterator over Result<T, E>s can be collected into a Result<C, E>.
Noting that std::io::Result<A> is a synonym for Result<A, std::io::Error>, we can see that step 5 uses the aforementioned impl with T = f64, E = std::io::Error, and C = Vec<f64>.
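The same impl works for other choices of T, E, and C. The sketch below uses T = i32, E = std::num::ParseIntError, and C = Vec<i32>; the helper parse_all is a hypothetical name used only for illustration.

```rust
use std::num::ParseIntError;

// Hypothetical helper: parses every word, or fails with the first parse error.
fn parse_all(words: &[&str]) -> Result<Vec<i32>, ParseIntError> {
    words
        .iter()
        .map(|w| w.parse::<i32>()) // iterator over Result<i32, ParseIntError>
        .collect() // becomes Result<Vec<i32>, ParseIntError>
}
```

Calling parse_all(&["1", "2", "3"]) yields Ok of a three-element vector, whereas a slice containing a non-numeric word yields the Err produced for that word.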
Making our IO functions generic over the Read and Write traits means that it’s easy to test read_measurements and write_output from within Rust’s built-in unit testing framework.
In fact, occasionally we might write our whole program as a function from input to output. (Don’t do this on your homework, because all your homework programs are intended to be interactive.)
In any case, parameterizing our functions this way lets us write assertions that:
- read_measurements parses a particular input (given as a string literal) into a particular internal representation;
- write_output unparses a particular internal representation into a particular output (given as a string literal); and
- transform transforms a particular input into a particular output (both given as string literals).
When writing tests that require some special setup or comparison, it’s not very nice to repeat that code. It’s much nicer to abstract the boilerplate into a function like assert_read, assert_write, or assert_transform, and then express each of your test cases in terms of your new assertion. Read assert_transform carefully to see how it:
- creates an empty vector of bytes to use as a mock Write,
- views a string as a byte array to use as a mock Read,
- attempts to convert the Vec<u8> output into a UTF-8 string, failing the test if it can’t, and finally
- asserts that the output was what we expected.
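A helper of this shape might be sketched as follows. The transform defined here is only a stand-in that upper-cases each line, since the crate’s real transform and its signature are not shown in this text; the signature of assert_transform is likewise an assumption.

```rust
use std::io::{self, BufRead, BufReader, Read, Write};

// Stand-in for the code under test (an assumption, not the crate's transform):
// echoes every input line in upper case.
fn transform<R: Read, W: Write>(input: R, mut output: W) -> io::Result<()> {
    for line in BufReader::new(input).lines() {
        writeln!(output, "{}", line?.to_uppercase())?;
    }
    Ok(())
}

// Sketch of a custom assertion that hides the mock-IO boilerplate.
fn assert_transform(input: &str, expected: &str) {
    let mut output: Vec<u8> = Vec::new(); // mock Write
    // The input string, viewed as bytes, serves as the mock Read.
    transform(input.as_bytes(), &mut output).expect("transform failed");
    // Fail the test if the output is not valid UTF-8.
    let actual = String::from_utf8(output).expect("output was not UTF-8");
    assert_eq!(actual, expected);
}
```

With the helper in place, each test case shrinks to a single line such as assert_transform("a\nb\n", "A\nB\n").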