Hacking On Magpie
If you'd like to contribute to Magpie, or just poke around at its internals, this guide is for you. A word of warning before we start: Magpie is under heavy development. As soon as I save this doc, I'll probably edit some code and invalidate something here. I'll try to keep this up-to-date, but understand that there will likely be some lag.
The Magpie interpreter is written in straight Java. It has no external dependencies and doesn't really do any magical stuff in the language. Since it changes so frequently, the code isn't as well-documented as I'd like, but it's getting there. You should be able to figure out what's going on without too much trouble.
/base: Contains the base library: the Magpie code automatically run at startup.
/doc: Documentation, of course. Lots of notes to myself and stuff that may be outdated, but may be useful.
/doc/site: The Markdown text and Python script used to build this site.
/old: The old pre-Java, C# statically-typed version of Magpie.
/script: Example Magpie scripts.
/spec: The Magpie spec, the executable specification that defines and verifies the language.
/src/com/stuffwithstuff/magpie: The Java source code of the interpreter.
The code for the interpreter is split out into seven namespaces:
com.stuffwithstuff.magpie: Classes that define the visible programmatic interface to the language. If you host Magpie within your own app, you'll talk to Magpie through here.
com.stuffwithstuff.magpie.app: Stuff for the standalone application: main(), the REPL, file-loading, etc.
com.stuffwithstuff.magpie.ast: Classes for the parsed syntax tree. The code here defines the data structures that hold Magpie code.
com.stuffwithstuff.magpie.interpreter: The interpreter. The heart of Magpie is here.
com.stuffwithstuff.magpie.intrinsics: Built-in methods in Magpie that are implemented in Java live here.
com.stuffwithstuff.parser: The lexer and parser. This takes in a string of code and spits out AST.
com.stuffwithstuff.util: Other random utility stuff.
The simplest way to understand Magpie's code is to walk through it in the order that it gets evaluated.
We start in Magpie:main(), of course. All it does is parse the command-line args and then either start a REPL, run the test suite, or execute a script. Let's assume we're running a script for now.
Once we've loaded a .mag file into a String, it gets passed to the lexer: Lexer. That class takes in a String and chunks it into a series of Tokens. A Token is the smallest meaningful chunk of code: a complete number, keyword, name, operator, etc.
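To make the lexing step concrete, here is a minimal sketch of chunking a String into tokens. It is not Magpie's actual Lexer: the names SketchLexer, SketchToken, and SketchTokenType are made up for illustration, and the token kinds are a tiny assumed subset.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

// Hypothetical token kinds; the real lexer recognizes many more.
enum SketchTokenType { NUMBER, NAME, OPERATOR, LEFT_PAREN, RIGHT_PAREN }

final class SketchToken {
    final SketchTokenType type;
    final String text;

    SketchToken(SketchTokenType type, String text) {
        this.type = type;
        this.text = text;
    }

    @Override
    public String toString() {
        return type + "(" + text + ")";
    }
}

final class SketchLexer {
    private final String source;
    private int pos = 0;

    SketchLexer(String source) {
        this.source = source;
    }

    /** Chunks the whole source string into tokens, skipping whitespace. */
    List<SketchToken> tokenize() {
        List<SketchToken> tokens = new ArrayList<>();
        while (pos < source.length()) {
            char c = source.charAt(pos);
            if (Character.isWhitespace(c)) {
                pos++; // Whitespace only separates tokens.
            } else if (Character.isDigit(c)) {
                tokens.add(new SketchToken(SketchTokenType.NUMBER,
                        consumeWhile(Character::isDigit)));
            } else if (Character.isLetter(c)) {
                tokens.add(new SketchToken(SketchTokenType.NAME,
                        consumeWhile(Character::isLetterOrDigit)));
            } else {
                pos++; // Everything else is a single-character token here.
                SketchTokenType type = c == '(' ? SketchTokenType.LEFT_PAREN
                        : c == ')' ? SketchTokenType.RIGHT_PAREN
                        : SketchTokenType.OPERATOR;
                tokens.add(new SketchToken(type, String.valueOf(c)));
            }
        }
        return tokens;
    }

    /** Consumes characters while the predicate holds and returns them. */
    private String consumeWhile(IntPredicate matches) {
        int start = pos;
        while (pos < source.length() && matches.test(source.charAt(pos))) {
            pos++;
        }
        return source.substring(start, pos);
    }
}

class SketchLexerDemo {
    public static void main(String[] args) {
        System.out.println(new SketchLexer("print(1 + 23)").tokenize());
    }
}
```

Running the demo on print(1 + 23) yields six tokens: a name, a left paren, two numbers separated by an operator, and a right paren.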
The Token stream is fed into the parser: MagpieParser. This is a simple recursive-descent parser with a fixed amount of lookahead. This class contains the core Magpie grammar. Generic parsing functionality is in its base class: Parser. Most of Magpie's high-level grammar is split out by keyword. When MagpieParser encounters a keyword like class, it hands off parsing to an ExprParser that is registered to that keyword. Eventually, these will likely be implemented in Magpie so that its syntax can be extended by users.
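The keyword hand-off can be sketched roughly like this. This is an illustration under assumptions, not the real MagpieParser: SketchParser, KeywordParser, and the toy "if ... then ..." form are all hypothetical, tokens are plain strings, and the resulting "AST" is just a parenthesized string.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a parser registered to one keyword. */
interface KeywordParser {
    String parse(SketchParser parser);
}

final class SketchParser {
    private final List<String> tokens;
    private final Map<String, KeywordParser> keywordParsers = new HashMap<>();
    private int pos = 0;

    SketchParser(List<String> tokens) {
        this.tokens = tokens;
    }

    void register(String keyword, KeywordParser parser) {
        keywordParsers.put(keyword, parser);
    }

    /** Consumes and returns the next token. */
    String advance() {
        if (pos >= tokens.size()) {
            throw new IllegalStateException("Unexpected end of input");
        }
        return tokens.get(pos++);
    }

    /** Consumes the next token, failing if it is not the expected one. */
    void expect(String expected) {
        String actual = advance();
        if (!actual.equals(expected)) {
            throw new IllegalStateException(
                    "Expected '" + expected + "' but found '" + actual + "'");
        }
    }

    /** Parses one expression, handing off to a registered parser on a keyword. */
    String parseExpression() {
        String token = advance();
        KeywordParser keywordParser = keywordParsers.get(token);
        if (keywordParser != null) {
            return keywordParser.parse(this);
        }
        return token; // Anything else is treated as a plain name or literal.
    }
}

class KeywordParserDemo {
    static String parse(List<String> tokens) {
        SketchParser parser = new SketchParser(tokens);
        // The keyword's parser recursively calls back into the main parser.
        parser.register("if", p -> {
            String condition = p.parseExpression();
            p.expect("then");
            String body = p.parseExpression();
            return "(if " + condition + " " + body + ")";
        });
        return parser.parseExpression();
    }

    public static void main(String[] args) {
        System.out.println(parse(Arrays.asList("if", "a", "then", "if", "b", "then", "c")));
    }
}
```

The point of the registry is that the core parser never needs to know the grammar of any particular keyword; adding a new form is just another register call.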
The end result of this is a single expression or a list of expressions. The source code is now in a form that Magpie can understand. Each expression will be an instance of a subclass of the base Expr class. There are subclasses for all of the core expression types (i.e. the things that don't get desugared away): literals (StringExpr, etc.), calls (CallExpr), flow control (ReturnExpr, etc.), and so on.
The actual set of Expr subclasses is in flux. It isn't well-defined what becomes a real AST node, and what gets desugared by the parser into something simpler. Having fewer AST classes makes the interpreter simpler to implement. Having more makes it easier to create good error messages. What will likely happen over time is that this will be split into two levels: a rich AST that stays very close to the text syntax, and a much simpler core syntax (basically just messages and literals) that the rich AST gets translated to. The core syntax will be what the interpreter or compiler sees.
Next, we create an Interpreter to actually interpret the Exprs. It creates a global Scope and defines the built-in types in it.
Interpreter then registers the built-in methods on those types. For each built-in Magpie class (such as Int), there is a corresponding static Java class that has the built-in methods for it (IntBuiltIns). Each method in that class has a Signature annotation that describes how the method looks to Magpie. BuiltIn uses reflection to find all of those methods and make them available to be called from Magpie.
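Here is a rough sketch of the annotation-plus-reflection idea. Everything in it is an assumption made for illustration: SketchSignature, SketchIntBuiltIns, and SketchBuiltIns are invented names, the annotation holds only the name Magpie code would use, and the real Signature annotation and method shapes will differ.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical annotation: just the name the method has on the Magpie side. */
@Retention(RetentionPolicy.RUNTIME) // Must be kept at runtime for reflection.
@Target(ElementType.METHOD)
@interface SketchSignature {
    String value();
}

/** Hypothetical holder of built-in methods for an Int-like type. */
final class SketchIntBuiltIns {
    @SketchSignature("+")
    public static Object add(Object self, Object arg) {
        return ((Integer) self) + ((Integer) arg);
    }

    @SketchSignature("toString")
    public static Object asString(Object self, Object arg) {
        return String.valueOf(self);
    }

    // Not annotated, so it is never exposed.
    public static Object helper(Object self, Object arg) {
        return null;
    }
}

final class SketchBuiltIns {
    /** Finds every annotated method on the class and indexes it by its Magpie name. */
    static Map<String, Method> register(Class<?> builtInClass) {
        Map<String, Method> methods = new TreeMap<>();
        for (Method method : builtInClass.getDeclaredMethods()) {
            SketchSignature signature = method.getAnnotation(SketchSignature.class);
            if (signature != null) {
                methods.put(signature.value(), method);
            }
        }
        return methods;
    }

    /** Calls a registered built-in by its Magpie name. */
    static Object call(Map<String, Method> methods, String name, Object self, Object arg) {
        Method method = methods.get(name);
        if (method == null) {
            throw new IllegalArgumentException("No built-in named " + name);
        }
        try {
            return method.invoke(null, self, arg); // null receiver: the method is static.
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Built-in " + name + " failed", e);
        }
    }
}

class SketchBuiltInsDemo {
    public static void main(String[] args) {
        Map<String, Method> methods = SketchBuiltIns.register(SketchIntBuiltIns.class);
        System.out.println(methods.keySet());
        System.out.println(SketchBuiltIns.call(methods, "+", 2, 3));
    }
}
```

The appeal of this approach is that adding a built-in only means writing one annotated Java method; nothing else has to be updated to make it callable.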
Now we've got a live Magpie environment we can start running code in, but it's still pretty empty. Things like interfaces aren't defined in Java; they're in the base library. So before we can do useful stuff, we need to load the base lib. This is done automatically by Script before it runs the user-provided code.
Finally we can throw our code at the interpreter. We pass it our parsed expressions. It creates an EvalContext, which defines the context in which code is executed: the local variable scope (and its parent scopes), as well as what this refers to. Code is always evaluated with an EvalContext.
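A minimal sketch of an evaluation context as described above: a chain of scopes plus a value for this. The names SketchScope and SketchEvalContext and their methods are assumptions for illustration, not the interpreter's real API.

```java
import java.util.HashMap;
import java.util.Map;

final class SketchScope {
    private final SketchScope parent; // null for the global scope
    private final Map<String, Object> variables = new HashMap<>();

    SketchScope(SketchScope parent) {
        this.parent = parent;
    }

    void define(String name, Object value) {
        variables.put(name, value);
    }

    /** Looks in this scope first, then walks outward through the parents. */
    Object lookUp(String name) {
        for (SketchScope scope = this; scope != null; scope = scope.parent) {
            if (scope.variables.containsKey(name)) {
                return scope.variables.get(name);
            }
        }
        throw new IllegalArgumentException("Undefined variable: " + name);
    }
}

final class SketchEvalContext {
    final SketchScope scope;
    final Object thisValue;

    SketchEvalContext(SketchScope scope, Object thisValue) {
        this.scope = scope;
        this.thisValue = thisValue;
    }

    /** Returns a context with a fresh child scope and the same 'this'. */
    SketchEvalContext pushScope() {
        return new SketchEvalContext(new SketchScope(scope), thisValue);
    }
}

class SketchEvalContextDemo {
    public static void main(String[] args) {
        SketchScope global = new SketchScope(null);
        global.define("x", 1);

        SketchEvalContext outer = new SketchEvalContext(global, "receiver");
        SketchEvalContext inner = outer.pushScope();
        inner.scope.define("x", 2); // Shadows the global x inside the block only.

        System.out.println(inner.scope.lookUp("x"));
        System.out.println(outer.scope.lookUp("x"));
    }
}
```

Making the context a small immutable pair means entering a block is just a matter of creating a new context that wraps a child scope, and leaving the block is a matter of dropping it.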
It then creates an ExprEvaluator. Expr uses the Visitor Pattern to allow operations on the different expression types without having to put that code directly in the Expr classes. ExprEvaluator is one of the two visitors in Magpie. It, as you'd expect, evaluates Exprs. This is where the actual Magpie code gets interpreted.
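For anyone unfamiliar with the Visitor Pattern, here is a small sketch of how it lets an evaluator live outside the expression classes. The class names are hypothetical and the two expression types are a toy subset; the printer is included only to show a second operation reusing the same tree, and is not meant to represent Magpie's other visitor.

```java
/** One visit method per expression type; R is the result of the operation. */
interface SketchExprVisitor<R> {
    R visit(SketchIntExpr expr);
    R visit(SketchAddExpr expr);
}

abstract class SketchExpr {
    /** Dispatches to the visit method matching this expression's concrete type. */
    abstract <R> R accept(SketchExprVisitor<R> visitor);
}

final class SketchIntExpr extends SketchExpr {
    final int value;

    SketchIntExpr(int value) {
        this.value = value;
    }

    @Override
    <R> R accept(SketchExprVisitor<R> visitor) {
        return visitor.visit(this);
    }
}

final class SketchAddExpr extends SketchExpr {
    final SketchExpr left;
    final SketchExpr right;

    SketchAddExpr(SketchExpr left, SketchExpr right) {
        this.left = left;
        this.right = right;
    }

    @Override
    <R> R accept(SketchExprVisitor<R> visitor) {
        return visitor.visit(this);
    }
}

/** Tree-walk evaluation: recursively evaluates child nodes, then combines them. */
final class SketchEvaluator implements SketchExprVisitor<Integer> {
    @Override
    public Integer visit(SketchIntExpr expr) {
        return expr.value;
    }

    @Override
    public Integer visit(SketchAddExpr expr) {
        return expr.left.accept(this) + expr.right.accept(this);
    }
}

/** A second operation over the same classes, added without touching them. */
final class SketchPrinter implements SketchExprVisitor<String> {
    @Override
    public String visit(SketchIntExpr expr) {
        return Integer.toString(expr.value);
    }

    @Override
    public String visit(SketchAddExpr expr) {
        return "(" + expr.left.accept(this) + " + " + expr.right.accept(this) + ")";
    }
}

class SketchVisitorDemo {
    public static void main(String[] args) {
        SketchExpr expr = new SketchAddExpr(
                new SketchIntExpr(1),
                new SketchAddExpr(new SketchIntExpr(2), new SketchIntExpr(3)));
        System.out.println(expr.accept(new SketchPrinter()));
        System.out.println(expr.accept(new SketchEvaluator()));
    }
}
```

The trade-off is the usual one for visitors: adding a new operation is one new class, while adding a new expression type means touching every visitor.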
Right now, Magpie is a tree-walk interpreter: it interprets code by recursively traversing an expression tree and evaluating the nodes. At some point, it will likely compile to bytecode instead. When that happens, ExprEvaluator will likely be replaced by an ExprCompiler, which will walk the AST and generate bytecode.
We pass our expressions to the evaluator, which evaluates them. Ta-da! We've run Magpie code.