## Virtual Machine

Recall from the crash course that a (process) VM abstracts away hardware specific instructions so that its Bytecodes (abstract instructions) can be executed in any environment that has the Bytecode runtime support. So to create our VM and its runtime, we need to define our

• Opcodes (new encoding atoms)
• Bytecode representation and
• Runtime model as Stack Machine

### Opcode

Since our expression based `Calc` language is made up of

• constant integers
• unary (plus, minus sign) operators and
• binary (addition, subtraction) operators

we can define our new opcode encodings like

``````pub enum OpCode {
OpConstant(u16), // pointer to constant table
OpPop,           // pop is needed for execution
OpSub,
OpPlus,
OpMinus,
}
``````

Filename: calculator/src/compiler/vm/opcode.rs

We choose the simplest form of encoding i.e. encoding the ops as bytes `u8` (in hex format). That is,

``````        OpCode::OpConstant(arg) => make_three_byte_op(0x01, arg),
OpCode::OpPop => vec![0x02],  // decimal repr is 2
OpCode::OpAdd => vec![0x03],  // decimal repr is 3
OpCode::OpSub => vec![0x04],  // decimal repr is 4
OpCode::OpPlus => vec![0x0A], // decimal repr is 10
OpCode::OpMinus => vec![0x0B], // decimal repr is 11
``````

For easy of access, we store constant `Node::Int(i32)` nodes in a separate memory and `OpConstant(arg)` tracks these values. In a unary expression like `"1"`, we encode `Node::Int(1)` as the opcode `[1, 0, 0]` as the first constant. (0x01 in decimal is 1).

### Bytecode

We define

``````pub struct Bytecode {
pub instructions: Vec<u8>,
pub constants: Vec<Node>,
}
``````

Filename: calculator/src/compiler/vm/bytecode.rs

and pictorially, here is how `"1 + 2"` AST to Bytecode conversion would look like

and in Rust

``````ByteCode {
instructions: [1, 0, 0, 1, 0, 1, 3, 2],
constants: [Int(1), Int(2)]
}
``````

Now, we can implement our Bytecode interpreter

``````pub struct Interpreter {
bytecode: Bytecode,
}

impl Compile for Interpreter {
type Output = Bytecode;

fn from_ast(ast: Vec<Node>) -> Self::Output {
let mut interpreter = Interpreter {
bytecode: Bytecode::new(),
};
for node in ast {
println!("compiling node {:?}", node);
interpreter.interpret_node(node);
// pop one element from the stack after
// each expression statement to clean up
}
interpreter.bytecode
}
}
``````

Filename: calculator/src/compiler/vm/bytecode.rs

### Runtime

From previous example, our interpreter goes through the bytecode instructions and executes them.

Continuing our `"1 + 2"` Bytecode example,

``````instructions: [1, 0, 0, 1, 0, 1,   3,   2],
-------- --------    -    -
|         |       |    |
constants: [  Int(1),   Int(2)]  OpAdd  OpPop

[1, 0, 0] points to the first element in constants table i.e. Int(1)
[1, 0, 1] points to Int(2)
 (or [0x03]) corresponding to the Opcode *OpAdd*, performs the addition operation Int(1 + 2)
 (or [0x02]) corresponding to the Opcode *OpPop* pops out the computed Bytecodes
``````

and since we want to model our runtime as a Stack Machine so we define our VM as struct with Bytecode, stack memory (in Stack Machine) and a stack pointer to the next free space

``````const STACK_SIZE: usize = 512;

pub struct VM {
bytecode: Bytecode,
stack: [Node; STACK_SIZE],
stack_ptr: usize, // points to the next free space
}
``````

and with the help of instruction pointer (IP), we execute the Bytecodes as follows

``````    pub fn run(&mut self) {
let mut ip = 0; // instruction pointer
while ip < self.bytecode.instructions.len() {
let inst_addr = ip;
ip += 1;

0x01 => {
//OpConst
let const_idx = convert_two_u8s_to_usize(
self.bytecode.instructions[ip],
self.bytecode.instructions[ip + 1],
);
ip += 2;
self.push(self.bytecode.constants[const_idx].clone());
}
0x02 => {
//OpPop
self.pop();
}
0x03 => {
match (self.pop(), self.pop()) {
(Node::Int(rhs), Node::Int(lhs)) => self.push(Node::Int(lhs + rhs)),
_ => panic!("Unknown types to OpAdd"),
}
}
0x04 => {
// OpSub
match (self.pop(), self.pop()) {
(Node::Int(rhs), Node::Int(lhs)) => self.push(Node::Int(lhs - rhs)),
_ => panic!("Unknown types to OpSub"),
}
}
0x0A => {
// OpPlus
match self.pop() {
Node::Int(num) => self.push(Node::Int(num)),
_ => panic!("Unknown arg type to OpPlus"),
}
}
0x0B => {
// OpMinus
match self.pop() {
Node::Int(num) => self.push(Node::Int(-num)),
_ => panic!("Unknown arg type to OpMinus"),
}
}
_ => panic!("Unknown instruction"),
}
}
}

pub fn push(&mut self, node: Node) {
self.stack[self.stack_ptr] = node;
self.stack_ptr += 1; // ignoring the potential stack overflow
}

pub fn pop(&mut self) -> Node {
// ignoring the potential of stack underflow
// cloning rather than mem::replace for easier testing
let node = self.stack[self.stack_ptr - 1].clone();
self.stack_ptr -= 1;
node
}
``````

Filename: calculator/src/compiler/vm/vm.rs

To examine the generated Bytecodes and run our VM, we can do

``````let byte_code = Interpreter::from_source(source);
println!("byte code: {:?}", byte_code);
let mut vm = VM::new(byte_code);
vm.run();
println!("{}", vm.pop_last());
``````

Run tests locally for our VM with

``````cargo test vm --tests
``````

Checkout the next section on how to create a REPL for our `Calc` to compare different compilation paths.