jlox-rs (GitHub)

So this dude called Robert Nystrom wrote a book called Crafting Interpreters. You should totally get it. It's amazing.

It walks you through every small piece of creating an interpreter for a small programming language called Lox. The first ~third of the book guides us through building a tree-walk interpreter in Java, called jlox. I followed along with it, but instead of Java, wrote the code in Rust. Thus the name: jlox-rs, because I have the creativity of a small handful of dried moths. Rust can be compiled to WebAssembly, which runs in the browser, and after sprinkling a layer of web magic, you can now use this site to poke at my interpreter!

Challenges

The book sets a number of challenges. I've taken on all (?) of these, so the interpreter here has a few features not part of "standard" Lox:

  • You get a nice, friendly error message (thanks to a thing called an "error production") if you use a binary operator with a missing left-hand-side value.
  • You get a nice error message if you try to divide by zero.
  • When you concatenate (with +) anything to a string, that thing gets cast to a string.
  • Not depicted here: when run in REPL mode, the value of the last evaluated expression in an input is automatically printed.
  • You get a nice, friendly error message when you try to access a declared, but uninitialized variable.
  • break works inside loops as you'd expect (and errors out when used outside loops).
  • Anonymous functions exist: you can write var f = fun() { return 3; }; print f();
  • You get a static analysis error if a local variable is not used (except in the REPL).
  • Local variables are accessed in O(1) time by binding them in the variable resolution pass: each gets an index, and at runtime the variables are looked up in a Vec by index instead of by name in a (hash)map. By the way, this made everything after it way more complex :D
  • Here's one way this optimization made things more complex: when running as a REPL, each new input counts as starting from line `0`. That means variable bindings that say "this variable was declared at `(0:10)`" need to also account for which command the declaration was in.
  • Static methods exist. See the advanced_classes example.
  • Getters also exist! And inheritance works with them!
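
To make the index-based local variable lookup from the list above a bit more concrete, here is a minimal sketch. It is not the actual jlox-rs code: the names Resolver and Locals are made up for illustration, there is only a single flat scope (no nesting or shadowing), and values are plain f64 rather than real Lox values.

```rust
use std::collections::HashMap;

/// Hypothetical resolution pass: binds each local variable name to a slot
/// index once, before the program runs.
#[derive(Default)]
struct Resolver {
    slots: HashMap<String, usize>,
}

impl Resolver {
    /// Declares a local and returns its slot. Re-declaring a name returns
    /// the slot it already has (an assumption made to keep the sketch small).
    fn declare(&mut self, name: &str) -> usize {
        let next = self.slots.len();
        *self.slots.entry(name.to_string()).or_insert(next)
    }

    /// Name-to-slot lookup. This happens during resolution, not at runtime.
    fn resolve(&self, name: &str) -> Option<usize> {
        self.slots.get(name).copied()
    }
}

/// Hypothetical runtime storage: values live in a Vec and are addressed
/// by the slot index the resolver handed out.
#[derive(Default)]
struct Locals {
    values: Vec<f64>,
}

impl Locals {
    /// Stores a value in a slot, growing the Vec if needed.
    fn define(&mut self, slot: usize, value: f64) {
        if slot >= self.values.len() {
            self.values.resize(slot + 1, 0.0);
        }
        self.values[slot] = value;
    }

    /// Reads a slot by index; no string hashing or comparison involved.
    fn get(&self, slot: usize) -> Option<f64> {
        self.values.get(slot).copied()
    }
}
```

The point of the split is that the string-keyed map is only touched while resolving; once the program runs, a variable access is a plain Vec index.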

Rust things

The differences between Java and Rust led to a few interesting results:
  • The Java implementation uses Java to generate Java code to DRY the AST. Naturally, I used Rust macros for this (and a few other conveniences).
  • The Java implementation stores references to Environments (basically, runtime variable scopes) in multiple places; most notably, closures are implemented by "just" storing another reference to the right environment. In Rust, the least painful way I could find was wrapping environments in Rc<RefCell<_>>.
  • It's not just environments! All runtime values can appear in various positions where multiple references to them must exist at the same time; so basically EVERY runtime value / variable primarily exists as a Rc<RefCell<_>>.
  • The Result type in Rust is awesome. I used it to handle all kinds of errors, and the thiserror library to define specific error types. The book stores success states in booleans in some cases and throws exceptions in others; translating those took a little thinking (not too much).
  • While I was at it, I added the location of the error to error messages. As the book hints, this is best done by storing code locations as byte offsets most of the time, and only resolving them to line/column numbers when an error is printed to the user. (So that's what I did.)
  • There are a few enums with lots of variants. Sometimes code needs / wants to operate on a specific variant. I didn't find a great solution to this: mostly the data of each variant is now a stand-alone struct, which works, but it feels a bit clumsy. It also gets a bit tangled up with all the Rc<RefCell<_>>.
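
For readers who haven't met Rc<RefCell<_>> before, here is a rough sketch of how shared environments and closures can fit together. It is an illustration under assumptions, not the jlox-rs implementation: the names EnvRef, Environment and Closure are invented here, variables are looked up by name in a HashMap for brevity, and values are plain f64.

```rust
use std::cell::RefCell;
use std::collections::HashMap;
use std::rc::Rc;

/// Shared, mutable handle to a scope (hypothetical alias).
type EnvRef = Rc<RefCell<Environment>>;

/// A runtime variable scope with an optional link to its parent scope.
struct Environment {
    values: HashMap<String, f64>,
    enclosing: Option<EnvRef>,
}

impl Environment {
    /// Creates a new scope wrapped in Rc<RefCell<_>> so it can be shared.
    fn new_ref(enclosing: Option<EnvRef>) -> EnvRef {
        Rc::new(RefCell::new(Environment {
            values: HashMap::new(),
            enclosing,
        }))
    }

    /// Defines (or redefines) a variable in this scope.
    fn define(&mut self, name: &str, value: f64) {
        self.values.insert(name.to_string(), value);
    }

    /// Looks a variable up here, then walks outward through parent scopes.
    fn get(&self, name: &str) -> Option<f64> {
        match self.values.get(name) {
            Some(value) => Some(*value),
            None => self.enclosing.as_ref().and_then(|env| env.borrow().get(name)),
        }
    }

    /// Assigns to an existing variable; returns false if it is not defined.
    fn assign(&mut self, name: &str, value: f64) -> bool {
        if let Some(slot) = self.values.get_mut(name) {
            *slot = value;
            return true;
        }
        match &self.enclosing {
            Some(env) => env.borrow_mut().assign(name, value),
            None => false,
        }
    }
}

/// A closure "just" keeps another reference to the scope it was created in.
struct Closure {
    env: EnvRef,
}

impl Closure {
    /// Captures a scope by cloning the Rc, which only bumps a reference count.
    fn capture(env: &EnvRef) -> Self {
        Closure { env: Rc::clone(env) }
    }

    /// Reads a variable through the captured scope.
    fn read(&self, name: &str) -> Option<f64> {
        self.env.borrow().get(name)
    }
}
```

Rc gives the multiple owners, and RefCell moves the borrow checking to runtime so that any of those owners can mutate the scope. The price is that overlapping borrow() and borrow_mut() calls on the same scope panic instead of failing to compile.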

Other things

  • I added tests for features I implemented as I went along; this really helped out during some hairy refactors.
  • The Rust-to-WebAssembly pipeline is now fairly mature; I had very few problems setting it up.
  • The code editor is Monaco Editor, which is essentially the same thing that's used in VS Code. For syntax highlighting, I took one of the example Monaco language tokenizers, and tweaked it until it matched this Lox variant. So I guess you could say it's custom!
  • There are a number of things in the implementation I'm not quite happy with, but this is good enough to stop, and move on to the next part of the book!