This will be a short series on building a calculator REPL in Rust. This is a program that can evaluate expressions like (+ 3 (2 * 4))
(and get 11
in this case).
The source is all on GitHub. Also, to help people follow along, I’ve made branches that represent different steps of the process. All the branches are named like step-n-do-foo
. My first code is on step-1-node-structs
is where I added the tokens (which I called nodes for a minute).
The code has two main components: the tokenizer and the interpreter.
Tokenizer
The tokenizer takes an input string and returns tokens. Tokens are basically types internal to the interpreter, that mean something to it. Transforming an input string into a series of tokens is the first step towards interpreting it. My token class looks like this:
pub enum Token { | |
LeftParen, | |
RightParen, | |
Operator(Opcode), | |
Operand(isize) | |
} | |
pub enum Opcode { | |
Add, Subtract, Multiply, Divide, | |
} |
What we’re saying is that there are four types of things that mean something in this program: left parens, right parens, operators, and operands. We need left and right parens to delimit expressions, and we need operators and operands to operate on. Operators themselves come in four types: the four arithmetic operations. Operands are defined as an isize
, which is rust for “signed integer of the native pointer length for the processor we’re on.” This is an integer64 on my laptop.
We can think of the tokenizer as a function that takes a string (the input program) and turns it into an array of tokens that mean something to the rest of the program. For this step, I’m using the crate nom
, which lets me write functions like “pull digits from the front of the string until you hit something thats not a digit, then parse that substring as an integer,” but without a bunch of nasty for loops.
Interpeter
The other part of this program is the interpreter. The interpeter takes an array of tokens and evaluates what they mean. It is a shift-reduce parser.
A shift-reduce parser has two operations: shift, which takes one more token from the beginning of the array of unscanned tokens, and reduce, which simplifies some number of tokens and replaces them with the result of the simplification. For example, let’s say we want to compute (+ (* 2 3) (* 2 1))
. The execution would look like this:
- Shift – push the
(
onto the stack. Remaining:+ (* 2 3) (* 2 1))
- Shift – push the
+
operator onto the stack. Remaining:(* 2 3) (* 2 1))
- Shift – push the
(
onto the stack. Remaining:* 2 3) (* 2 1))
- Shift – push the
*
operator onto the stack: Remaining:2 3) (* 2 1))
- Shift – push the
2
. Remaining:3) (* 2 1))
- Shift – push the
3
. Remaining:) (* 2 1))
- Shift – push the
)
. Remaining:(* 2 1))
- Reduce! The top of the stack is now
(* 2 3)
, which can simplify to six. Pop the stack until we see a left paren, then simplify that expression. Push6
onto the stack. - Shift – push the
(
. Remaining:* 2 1))
- Shift – push the
*
. Remaining:2 1))
- Shift – push the
2
. Remaining:1))
- Shift – push the
1
. Remaining:))
. - Shift – push the
)
. Remaining)
. - Reduce! The top of the stack is now
(* 2 1)
which reduces to2
. Push2
. - Shift – push the
)
. Nothing remaining. - Reduce! The top of the stack is now
(+ 6 2)
which reduces to8
. Return8
.
So that’s what we’re building this month! A tokenizer and a shift reduce parser. Next time, we’ll take a look at how the tokenizer actually works.
Till then, happy learning!
-Will
[…] second post in a series about building a calculator REPL in Rust. You may want to start with the first post. Today I’ll talk about how the tokenizer is […]
LikeLike
[…] and final part of a series on building a simple calculator REPL in Rust. (You may with to read part one and part two […]
LikeLike
[…] This particular attempt at building a Lisp didn’t work out, but you can check out my more recent attempt if you’re […]
LikeLike
[…] installment in my discussion of building a calculator REPL in Rust. (For convenience, here are part 1 part 2 and part 3 .) I’ve decided that I like this project a lot, and that I’m going to […]
LikeLike