A Calculator REPL in Rust

This will be a short series on building a calculator REPL in Rust. This is a program that can evaluate expressions like (+ 3 (2 * 4)) (and get 11 in this case).

The source is all on GitHub. Also, to help people follow along, I’ve made branches that represent different steps of the process. All the branches are named like step-n-do-foo. My first code is on step-1-node-structs is where I added the tokens (which I called nodes for a minute).

The code has two main components: the tokenizer and the interpreter.

Tokenizer

The tokenizer takes an input string and returns tokens. Tokens are basically types internal to the interpreter, that mean something to it. Transforming an input string into a series of tokens is the first step towards interpreting it. My token class looks like this:

What we’re saying is that there are four types of things that mean something in this program: left parens, right parens, operators, and operands. We need left and right parens to delimit expressions, and we need operators and operands to operate on. Operators themselves come in four types: the four arithmetic operations. Operands are defined as an isize, which is rust for “signed integer of the native pointer length for the processor we’re on.” This is an integer64 on my laptop.

We can think of the tokenizer as a function that takes a string (the input program) and turns it into an array of tokens that mean something to the rest of the program. For this step, I’m using the crate nom, which lets me write functions like “pull digits from the front of the string until you hit something thats not a digit, then parse that substring as an integer,” but without a bunch of nasty for loops.

Interpeter

The other part of this program is the interpreter. The interpeter takes an array of tokens and evaluates what they mean. It is a shift-reduce parser.

A shift-reduce parser has two operations: shift, which takes one more token from the beginning of the array of unscanned tokens, and reduce, which simplifies some number of tokens and replaces them with the result of the simplification. For example, let’s say we want to compute (+ (* 2 3) (* 2 1)). The execution would look like this:

  1. Shift – push the ( onto the stack. Remaining: + (* 2 3) (* 2 1))
  2. Shift – push the + operator onto the stack. Remaining: (* 2 3) (* 2 1))
  3. Shift – push the ( onto the stack. Remaining: * 2 3) (* 2 1))
  4. Shift – push the * operator onto the stack: Remaining: 2 3) (* 2 1))
  5. Shift – push the 2. Remaining: 3) (* 2 1))
  6. Shift – push the 3. Remaining: ) (* 2 1))
  7. Shift – push the ). Remaining: (* 2 1))
  8. Reduce! The top of the stack is now (* 2 3), which can simplify to six. Pop the stack until we see a left paren, then simplify that expression. Push 6 onto the stack.
  9. Shift – push the (. Remaining: * 2 1))
  10. Shift – push the *. Remaining: 2 1))
  11. Shift – push the 2. Remaining: 1))
  12. Shift – push the 1. Remaining: )).
  13. Shift – push the ). Remaining ).
  14. Reduce! The top of the stack is now (* 2 1) which reduces to 2. Push 2.
  15. Shift – push the ). Nothing remaining.
  16. Reduce! The top of the stack is now (+ 6 2) which reduces to 8. Return 8.

So that’s what we’re building this month! A tokenizer and a shift reduce parser. Next time, we’ll take a look at how the tokenizer actually works.

Till then, happy learning!

-Will

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s