Abstract Syntax Tree (AST)
AST comes into picture when we want to go from the string representation of our program like "-1"
or "1 + 2"
to something more manageable and easier to work with. Since our program is not a random string (the grammar is for), we can use the structure within the expressions "-1"
and "1 + 2"
to our own advantage and come up with a new representation like a tree
One thing to note here is that the kinds of the nodes in our tree are not the same i.e. +
node is different from 1
node. In fact, +
has an Operator type and 1
is an integer Int type
so we define our AST nodes as
#![allow(unused)] fn main() { pub enum Operator { Plus, Minus, } pub enum Node { Int(i32), } }
Referring back to our grammar, we actually have different kinds of recursive expressions;
- unary grammar
UnaryExpr = { Operator ~ Term }
- binary grammar
BinaryExpr = { Term ~ (Operator ~ Term)* }
So for example, the expression "-1 + (2 + 3)"
has this recursive structure
To include those into our AST to make it an actual tree data structure, we complete our AST definition as follows
#![allow(unused)] fn main() { pub enum Operator { Plus, Minus, } pub enum Node { Int(i32), UnaryExpr { op: Operator, child: Box<Node>, }, BinaryExpr { op: Operator, lhs: Box<Node>, rhs: Box<Node>, }, } }
Filename: calculator/src/ast.rs
Now, we can use the pest
generated CalcParser::parse
to map the Rules of our Calc
language string to our AST.
pub fn parse(source: &str) -> std::result::Result<Vec<Node>, pest::error::Error<Rule>> {
let mut ast = vec![];
let pairs = CalcParser::parse(Rule::Program, source)?;
for pair in pairs {
if let Rule::Expr = pair.as_rule() {
ast.push(build_ast_from_expr(pair));
}
}
Ok(ast)
}
Checkout calculator/src/parser.rs.
Note that CalcParser::parse
takes care of the AST traversal and correctly maps it to Vec<Node>
for easier access
in later stages of compilation.
Interpreter
CPU is the ultimate interpreter. That is, it executes opcodes as it goes. To do that, after we have changed the representation (aka lowered the representation) of our source code &str
to AST Node
, a basic interpreter looks at each node of the AST (via any tree traversal methods) and simply evaluates it recursively
pub fn eval(&self, node: &Node) -> i32 {
match node {
Node::Int(n) => *n,
Node::UnaryExpr { op, child } => {
let child = self.eval(child);
match op {
Operator::Plus => child,
Operator::Minus => -child,
}
}
Node::BinaryExpr { op, lhs, rhs } => {
let lhs_ret = self.eval(lhs);
let rhs_ret = self.eval(rhs);
match op {
Operator::Plus => lhs_ret + rhs_ret,
Operator::Minus => lhs_ret - rhs_ret,
}
}
}
}
To sum up, we define a Compile
trait that we will use throughout this chapter
pub trait Compile {
type Output;
fn from_ast(ast: Vec<Node>) -> Self::Output;
fn from_source(source: &str) -> Self::Output {
println!("Compiling the source: {}", source);
let ast: Vec<Node> = parser::parse(source).unwrap();
println!("{:?}", ast);
Self::from_ast(ast)
}
}
and we can now implement our interpreter
pub struct Interpreter;
impl Compile for Interpreter {
type Output = Result<i32>;
fn from_ast(ast: Vec<Node>) -> Self::Output {
let mut ret = 0i32;
let evaluator = Eval::new();
for node in ast {
ret += evaluator.eval(&node);
}
Ok(ret)
}
}
Filename: calculator/src/compiler/interpreter.rs
and test
assert_eq!(Interpreter::from_source("1 + 2").unwrap(), 3);
Run such tests locally with
cargo test interpreter --tests