Wires

I like retro games, so I’m trying to write a Zork machine in Rust. That said, I’m brand new to Rust, so I’ve decided to take smaller bites of the language until I’m a little bit more comfortable with how it works. This post is the story of one of these smaller projects.

Enter wires, a (much simplified) tool in Rust similar to the GNU utility strings. Wires consumes a file, one byte at a time. If it finds a run of at least 3 consecutive bytes where every byte is a printable ASCII character, it writes that run to stdout. If it can’t find or open the file, it exits with status 1, and if writing to stdout fails, it panics. Not super complicated, but it’s finished and does just that.

The most challenging aspect for me was decoupling the implementation of the actual parser (if you can call it that) from stdout. I didn’t want there to be explicit calls to println!() in the parser. In an early, bad design decision, I tried to make one function that would accept the output handle (in this case stdout) and a path, and then open the path and pass the byte stream from the file, and the output handle, to the parser. This type of indirection proved to be a little bit difficult in Rust. I needed a function that accepted a mutable reference to stdout and then passed it on, still mutable. At the time I read the compiler error as Rust refusing to mutably borrow a mutable borrow, so I gave up on building it this way. The actual complaint is narrower: w is already a mutable reference, and writing &mut w asks for a mutable borrow of the binding w itself, which isn’t declared mut. Here’s some rusty pseudocode to show what I mean:

// this is the code that didn't work. 
fn main() {
    // get path from args
    let stdout = io::stdout();
    let mut handle = stdout.lock();
    read_and_parse_file(&path, &mut handle);
}

fn read_and_parse_file<W: Write>(path: &str, w: &mut W) {
    //try to open file
    // if it works:
    print_strings(&bytes, &mut w); // compiler error: cannot borrow `w` as mutable
}

fn print_strings<W: Write>(bytes: &[u8], w: &mut W) {
    //loop over bytes and print strings to w
    writeln!(w, "{}", some_string);
}
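
For what it's worth, here is a hedged sketch of a layered version that does compile. Since w is already a &mut W, handing it on as-is lets Rust reborrow it for the inner call. The file handling is left out, parse_bytes is a hypothetical name for the middle layer, and the body of print_strings is a placeholder, so this is an illustration rather than the code in wires:

```rust
use std::io::{self, Write};

// Hypothetical middle layer: receives the writer and hands it on.
fn parse_bytes<W: Write>(bytes: &[u8], w: &mut W) -> io::Result<()> {
    // `w` is already `&mut W`; passing it directly reborrows it,
    // so no `&mut w` is needed here.
    print_strings(bytes, w)
}

fn print_strings<W: Write>(bytes: &[u8], w: &mut W) -> io::Result<()> {
    // Placeholder body: report the byte count instead of real parsing.
    writeln!(w, "{} bytes", bytes.len())
}

fn main() {
    let stdout = io::stdout();
    let mut handle = stdout.lock();
    parse_bytes(b"hello", &mut handle).expect("failed to write to stdout");
}
```

Returning io::Result from both layers is a design choice of this sketch; it leaves the decision to panic on a failed write to main().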

I could have had print_strings() return the strings instead of sending them to the output buffer as soon as they’re found, but that would make my code perform very poorly on large files: it would end up allocating a giant vector of little strings. Instead, I ended up having main() open the file, read its bytes, and pass them straight to the parser along with the stdout handle:

fn main() {
    // get path from args and open the file (exit 1 if that fails).
    // if that works:
    let mut reader = BufReader::new(file);
    let mut contents: Vec<u8> = Vec::new();
    let read_result = reader.read_to_end(&mut contents);
    // check read_result, then get a handle to stdout:
    let stdout = io::stdout();
    let mut handle = stdout.lock();
    bytes_to_strings(&contents, &mut handle);
}

I think my object-oriented brain was thinking, “decouple it, pass a reference to an object that will use the reference to make more objects!” And I should have been thinking about memory, not objects.

Along the way, I learned a few handy things. For example, this snippet:

use std::io::Write;

pub fn bytes_to_strings<W: Write>(bytes: &[u8], w:  &mut W) {
    // see https://github.com/willmurphyscode/wires/blob/master/src/wires.rs#L6
}

The above function signature is generic over a type W; that is, it accepts any type that implements the trait Write (as defined in std::io), whether that’s a lock on stdout, a file, or a Vec<u8>. I think it’s pretty cool that a programming language that’s sufficiently close to the metal that we can comfortably re-implement old GNU tools also has trait-based interfaces like this.
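
To make that concrete, here is a minimal sketch of what a body for such a function could look like. It is not the code from the linked repository: is_printable, write_run, MIN_LEN, and the io::Result return type are all assumptions of mine, and I'm treating "printable" as space through tilde. Because the function only asks for a Write, the example can hand it a Vec<u8> and inspect the output without touching stdout:

```rust
use std::io::{self, Write};

// Assumed minimum run length, taken from the description of wires.
const MIN_LEN: usize = 3;

// Assumption: "printable ASCII" means space (0x20) through tilde (0x7e).
fn is_printable(b: u8) -> bool {
    b == b' ' || b.is_ascii_graphic()
}

pub fn bytes_to_strings<W: Write>(bytes: &[u8], w: &mut W) -> io::Result<()> {
    let mut start = 0; // index where the current run began
    for (i, &b) in bytes.iter().enumerate() {
        if !is_printable(b) {
            // A non-printable byte ends the current run.
            write_run(&bytes[start..i], w)?;
            start = i + 1;
        }
    }
    // Handle a run that reaches the end of the input.
    write_run(&bytes[start..], w)
}

// Hypothetical helper: writes the run plus a newline if it is long enough.
fn write_run<W: Write>(run: &[u8], w: &mut W) -> io::Result<()> {
    if run.len() >= MIN_LEN {
        w.write_all(run)?;
        w.write_all(b"\n")?;
    }
    Ok(())
}

fn main() -> io::Result<()> {
    // A Vec<u8> implements Write, so it can stand in for stdout.
    let mut out: Vec<u8> = Vec::new();
    bytes_to_strings(b"\x00\x01abc\x02hi\x03wires!", &mut out)?;
    assert_eq!(out, b"abc\nwires!\n");
    Ok(())
}
```

Each run is written as soon as it ends, so nothing accumulates beyond the input slice itself.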

Also, I learned how to get a reference to stdout and use it in code, instead of just calling println!() when I need stdout:

    let stdout = io::stdout();
    let mut handle = stdout.lock();
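
As a small usage sketch: the locked handle implements Write, so it works with writeln!() and with any function that takes a writer. The helper write_lines below is a hypothetical name of mine, not something from wires:

```rust
use std::io::{self, Write};

// Hypothetical helper: writes each line to whatever writer it is given.
fn write_lines<W: Write>(lines: &[&str], w: &mut W) -> io::Result<()> {
    for line in lines {
        writeln!(w, "{}", line)?;
    }
    Ok(())
}

fn main() {
    let stdout = io::stdout();
    // Lock once up front instead of re-locking on every println!().
    let mut handle = stdout.lock();
    write_lines(&["first", "second"], &mut handle).expect("failed to write to stdout");
}
```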

It was a fun learning experience. I’m happy to take suggestions and pull requests.

Till next time, happy learning!

-Will
