Debugging Your Language

Before we move on, let’s establish the debugging techniques you’ll use throughout this book. Language implementations have many moving parts, and knowing how to inspect each layer saves hours of frustration.

The Golden Rule

Your language is a pipeline. Data flows through stages: Source → Tokens → AST → Output. When something breaks, find which stage produced the wrong output.

[Figure: dataflow from source code to result]

Debug by checking each arrow: Is the input to this stage correct? Is the output?
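One way to check each arrow is to run the stages by hand and print what each one hands to the next. The sketch below is a minimal, self-contained illustration, not the book’s implementation: `Token`, `Expr`, `tokenize`, `parse`, `eval`, and `run_traced` are hypothetical stand-ins that handle only integers and `+`.

```rust
// Hypothetical three-stage pipeline: Source -> Tokens -> AST -> Output.
// It only understands whitespace-separated integers and `+`.

#[derive(Debug, PartialEq)]
enum Token {
    Int(i64),
    Plus,
}

#[derive(Debug, PartialEq)]
enum Expr {
    Int(i64),
    Add(Box<Expr>, Box<Expr>),
}

// Stage 1: source text to tokens.
fn tokenize(source: &str) -> Result<Vec<Token>, String> {
    let mut tokens = Vec::new();
    for word in source.split_whitespace() {
        if word == "+" {
            tokens.push(Token::Plus);
        } else {
            let n = word
                .parse::<i64>()
                .map_err(|_| format!("unexpected token: {word}"))?;
            tokens.push(Token::Int(n));
        }
    }
    Ok(tokens)
}

// Stage 2: tokens to a left-associative AST.
fn parse(tokens: &[Token]) -> Result<Expr, String> {
    let mut iter = tokens.iter();
    let mut expr = match iter.next() {
        Some(Token::Int(n)) => Expr::Int(*n),
        other => return Err(format!("expected integer, got {other:?}")),
    };
    while let Some(tok) = iter.next() {
        if *tok != Token::Plus {
            return Err(format!("expected `+`, got {tok:?}"));
        }
        match iter.next() {
            Some(Token::Int(n)) => {
                expr = Expr::Add(Box::new(expr), Box::new(Expr::Int(*n)));
            }
            other => return Err(format!("expected integer after `+`, got {other:?}")),
        }
    }
    Ok(expr)
}

// Stage 3: AST to a value.
fn eval(expr: &Expr) -> i64 {
    match expr {
        Expr::Int(n) => *n,
        Expr::Add(left, right) => eval(left) + eval(right),
    }
}

// Run every stage and print its output to stderr, so the first wrong
// line points at the stage that introduced the bug.
fn run_traced(source: &str) -> Result<i64, String> {
    eprintln!("source: {source:?}");
    let tokens = tokenize(source)?;
    eprintln!("tokens: {tokens:?}");
    let ast = parse(&tokens)?;
    eprintln!("ast:    {ast:?}");
    let result = eval(&ast);
    eprintln!("output: {result}");
    Ok(result)
}

fn main() {
    match run_traced("1 + 2 + 3") {
        Ok(value) => println!("{value}"),
        Err(e) => eprintln!("error: {e}"),
    }
}
```

Read the trace from the top: the first line that looks wrong names the stage to investigate, and every stage after it can be ignored until that one is fixed.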

The AST is your program’s structure. When behavior is wrong, print it:

// In your code
let ast = parse(source)?;
println!("{:#?}", ast);  // Pretty-print with {:#?}

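Output (condensed here; if your AST uses a derived Debug, {:#?} spreads tuple variants such as Int(1) over several lines):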

Binary {
    op: Add,
    left: Int(1),
    right: Int(2),
}

If this looks wrong, your parser has a bug. If it looks right, the bug is in execution.

Check Operator Precedence

A common parsing bug: 1 + 2 * 3 parses as (1 + 2) * 3 instead of 1 + (2 * 3).

let ast = parse("1 + 2 * 3")?;
println!("{:#?}", ast);

// Should be: Add(1, Mul(2, 3))
// Bug if:    Mul(Add(1, 2), 3)

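Fix: Check how your grammar encodes precedence. In a PEG, ordered choice only decides which alternative is tried first; operator precedence comes from how the rules nest, with tighter-binding operators such as * handled in a rule below looser ones such as +. If you use pest’s PrattParser, operators registered in later .op() calls bind tighter.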
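To make the precedence check concrete, here is a minimal sketch of a layered grammar written as a hand-rolled recursive-descent parser. It is an illustration under assumptions, not the book’s pest grammar: the `Parser` type and the `expr`, `term`, and `int` methods are hypothetical names. Because `expr` (for `+`) calls `term` (for `*`), multiplication ends up deeper in the tree and binds tighter.

```rust
use std::iter::Peekable;
use std::str::Chars;

#[derive(Debug, PartialEq)]
enum Expr {
    Int(i64),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
}

// Hypothetical parser over raw characters; supports integers, `+`, and `*`.
struct Parser<'a> {
    chars: Peekable<Chars<'a>>,
}

impl<'a> Parser<'a> {
    fn new(src: &'a str) -> Self {
        Parser { chars: src.chars().peekable() }
    }

    fn skip_ws(&mut self) {
        while self.chars.next_if(|c| c.is_whitespace()).is_some() {}
    }

    // expr := term ("+" term)*   (lowest precedence, outermost rule)
    fn expr(&mut self) -> Result<Expr, String> {
        let mut left = self.term()?;
        loop {
            self.skip_ws();
            if self.chars.next_if_eq(&'+').is_none() {
                return Ok(left);
            }
            let right = self.term()?;
            left = Expr::Add(Box::new(left), Box::new(right));
        }
    }

    // term := int ("*" int)*   (higher precedence, nested inside expr)
    fn term(&mut self) -> Result<Expr, String> {
        let mut left = self.int()?;
        loop {
            self.skip_ws();
            if self.chars.next_if_eq(&'*').is_none() {
                return Ok(left);
            }
            let right = self.int()?;
            left = Expr::Mul(Box::new(left), Box::new(right));
        }
    }

    // int := one or more ASCII digits
    fn int(&mut self) -> Result<Expr, String> {
        self.skip_ws();
        let mut digits = String::new();
        while let Some(c) = self.chars.next_if(|c| c.is_ascii_digit()) {
            digits.push(c);
        }
        digits
            .parse::<i64>()
            .map(Expr::Int)
            .map_err(|_| "expected integer".to_string())
    }
}

fn parse(src: &str) -> Result<Expr, String> {
    let mut parser = Parser::new(src);
    let expr = parser.expr()?;
    parser.skip_ws();
    match parser.chars.peek() {
        None => Ok(expr),
        Some(c) => Err(format!("unexpected character: {c}")),
    }
}

fn main() {
    let ast = parse("1 + 2 * 3").expect("parse failed");
    // prints: Add(Int(1), Mul(Int(2), Int(3)))
    println!("{ast:?}");
}
```

Swapping the roles of `expr` and `term` reproduces the bug described above, which makes this a handy shape for a regression test on the AST itself rather than on the evaluated number.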

Test Small, Test Often

Don’t write 100 lines then debug. Test each feature in isolation:

#[test]
fn test_addition() {
    assert_eq!(eval("1 + 2"), 3);
}

#[test]
fn test_subtraction() {
    assert_eq!(eval("5 - 3"), 2);
}

#[test]
fn test_combined() {
    assert_eq!(eval("1 + 2 - 3"), 0);
}

When a test fails, you know exactly what’s broken.

Use the REPL

The REPL is your best friend for quick experiments:

>>> 1 + 2
3
>>> 1 + 2 * 3
7
>>> (1 + 2) * 3
9

If a complex expression fails, simplify until you find the minimal failing case.
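If you want a REPL for your own experiments, the loop itself is small. The following is a rough sketch, not the book’s REPL: `run_repl` is a hypothetical helper, `:quit` is an invented command, and `eval` is a stand-in that only sums integers separated by `+`. The point it illustrates is that errors are printed and the session keeps going, so you can keep shrinking an input until the failure is minimal.

```rust
use std::io::{self, BufRead, Write};

// Hypothetical stand-in for the language's evaluator:
// sums integers separated by `+`.
fn eval(source: &str) -> Result<i64, String> {
    source
        .split('+')
        .map(|part| {
            let part = part.trim();
            part.parse::<i64>()
                .map_err(|_| format!("not an integer: {part:?}"))
        })
        .sum()
}

// Read lines from `input`, evaluate each one, and write the result or
// the error to `output`. An error never ends the session.
fn run_repl<R: BufRead, W: Write>(input: R, mut output: W) -> io::Result<()> {
    write!(output, ">>> ")?;
    output.flush()?;
    for line in input.lines() {
        let line = line?;
        let line = line.trim();
        if line == ":quit" {
            break; // hypothetical exit command
        }
        if !line.is_empty() {
            match eval(line) {
                Ok(value) => writeln!(output, "{value}")?,
                Err(e) => writeln!(output, "error: {e}")?,
            }
        }
        write!(output, ">>> ")?;
        output.flush()?;
    }
    Ok(())
}

fn main() -> io::Result<()> {
    let stdin = io::stdin();
    run_repl(stdin.lock(), io::stdout())
}
```

Taking the reader and writer as parameters is a design choice that lets the same loop run against an in-memory buffer in a test.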

When Using LLVM (Later)

When we add LLVM compilation, two more techniques become useful: printing the generated IR and running LLVM’s verifier. To print the IR:

// After code generation
codegen.module.print_to_stderr();

You’ll see LLVM IR:

define i64 @add(i64 %a, i64 %b) {
entry:
  %result = add i64 %a, %b
  ret i64 %result
}

If this looks wrong, your codegen has a bug.

Use LLVM’s Verifier

codegen.module.verify().map_err(|e| {
    eprintln!("LLVM verification failed: {}", e);
    e.to_string()
})?;

The verifier catches:

  • Missing terminators (every basic block needs ret or br)
  • Type mismatches
  • Invalid instructions

Always verify before JIT execution!

How Errors Surface Today

Our languages report errors as values, not exceptions. Each stage hands back a Result: the calculator uses anyhow::Result<T>, while Firstlang, Secondlang, and Thirdlang return Result<T, String>. There are three kinds you will meet:

  • Parse errors come from pest and already carry a line and column:

    Parse error:  --> 8:9
      |
    8 |     def classify(self) -> int {
      |         ^---
      |
      = expected Identifier

  • Type errors come from the type checker as plain strings, for example Type mismatch: expected int, got bool.

  • Setup and I/O errors, such as pointing the CLI at a file that does not exist.

The CLIs for Firstlang, Secondlang, and Thirdlang print these with eprintln! and exit non-zero. The calculator is the exception: its entry point still unwrap()s the file read and the parse result, so a missing file or a syntax error there ends in a panic rather than a tidy message.

Two limits are worth naming. First, type-error strings have no source location, so they tell you what is wrong but not where. Second, the first error stops the pipeline; you fix one, rerun, and find the next. Real compilers attach a span to every diagnostic and recover to report several at once. We sketch how to get there in What’s Next.
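To make the first limit concrete, here is a minimal sketch of a type error that carries a location and renders it in a pest-like layout. Everything in it is an assumption for illustration: `Span`, `Ty`, `TypeError`, `check_assignment`, `render`, and the source line `let x: int = true` are hypothetical and are not taken from the book’s type checker.

```rust
// Hypothetical 1-based source position attached to a diagnostic.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Span {
    line: usize,
    col: usize,
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum Ty {
    Int,
    Bool,
}

impl Ty {
    fn name(self) -> &'static str {
        match self {
            Ty::Int => "int",
            Ty::Bool => "bool",
        }
    }
}

// An error value that says both what is wrong and where.
#[derive(Debug, PartialEq)]
struct TypeError {
    message: String,
    span: Span,
}

impl TypeError {
    // Render the error with the offending source line and a caret.
    fn render(&self, source: &str) -> String {
        let Span { line, col } = self.span;
        let text = line
            .checked_sub(1)
            .and_then(|index| source.lines().nth(index))
            .unwrap_or("");
        let gutter = line.to_string();
        let pad = " ".repeat(gutter.len());
        let caret_pad = " ".repeat(col.saturating_sub(1));
        let msg = &self.message;
        [
            format!("Type error:  --> {line}:{col}"),
            format!("{pad} |"),
            format!("{gutter} | {text}"),
            format!("{pad} | {caret_pad}^---"),
            format!("{pad} |"),
            format!("{pad} = {msg}"),
        ]
        .join("\n")
    }
}

// Hypothetical check: the declared type must match the value's type.
fn check_assignment(declared: Ty, actual: Ty, span: Span) -> Result<(), TypeError> {
    if declared == actual {
        Ok(())
    } else {
        Err(TypeError {
            message: format!(
                "Type mismatch: expected {}, got {}",
                declared.name(),
                actual.name()
            ),
            span,
        })
    }
}

fn main() {
    let source = "let x: int = true";
    // Column 14 is where `true` starts in the line above.
    let span = Span { line: 1, col: 14 };
    if let Err(err) = check_assignment(Ty::Int, Ty::Bool, span) {
        eprintln!("{}", err.render(source));
    }
}
```

The error is still an ordinary value returned through Result; the only change is that it carries enough information for the CLI to point at the offending text.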

The Debugging Mindset

Think like a detective. You have a crime (wrong output). You need to find where in the pipeline the crime occurred. Interrogate each stage until you find the culprit.

  1. Reproduce - Find the smallest input that triggers the bug
  2. Isolate - Which stage is producing wrong output?
  3. Inspect - Print the data at that stage
  4. Fix - Change the code
  5. Verify - Run your test again

This systematic approach works for any compiler bug.

Quick Reference

| Problem          | Debug Technique                   |
|------------------|-----------------------------------|
| Wrong result     | Print AST, check structure        |
| Parse error      | Simplify input, check grammar     |
| Precedence wrong | Print AST, check grammar order    |
| LLVM crash       | Print IR, run verifier            |
| Infinite loop    | Add print statements in eval loop |
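For the infinite-loop case, print statements work best when paired with a step budget, so the trace ends instead of scrolling forever. The budget is an addition of this sketch, not a technique from the chapter, and the tiny instruction set (`Instr`, `run`) is hypothetical and stands in for whatever loop your evaluator runs.

```rust
// Hypothetical three-instruction machine operating on a single counter.
#[derive(Debug, Clone, Copy)]
enum Instr {
    Dec,                   // counter -= 1
    JumpIfPositive(usize), // if counter > 0, jump to this instruction index
    Halt,                  // stop and return the counter
}

// Execute `program`, printing one trace line per step and giving up
// after `max_steps` so a looping program cannot hang the session.
fn run(program: &[Instr], mut counter: i64, max_steps: usize) -> Result<i64, String> {
    let mut pc = 0;
    for step in 0..max_steps {
        let instr = *program
            .get(pc)
            .ok_or_else(|| format!("pc {pc} is out of range"))?;
        eprintln!("[step {step}] pc={pc} instr={instr:?} counter={counter}");
        match instr {
            Instr::Dec => {
                counter -= 1;
                pc += 1;
            }
            Instr::JumpIfPositive(target) => {
                pc = if counter > 0 { target } else { pc + 1 };
            }
            Instr::Halt => return Ok(counter),
        }
    }
    Err(format!(
        "no Halt after {max_steps} steps (pc={pc}, counter={counter}); likely an infinite loop"
    ))
}

fn main() {
    // Terminates: decrement until the counter reaches zero.
    let ok = [Instr::Dec, Instr::JumpIfPositive(0), Instr::Halt];
    println!("{:?}", run(&ok, 3, 100));

    // Never terminates on its own: jumps back without changing the counter.
    let buggy = [Instr::JumpIfPositive(0), Instr::Halt];
    println!("{:?}", run(&buggy, 1, 20));
}
```

In the buggy run the trace repeats the same `pc` and `counter` on every line, which is usually enough to spot the state that never changes.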

Now you have the tools. Let’s continue building!