← Back to blog

How to write a (Lisp) Interpreter (in Dart)

Oleksandr Leushchenko

Senior Staff Engineer

25th July, 2023

Need a custom tool? Invertase can help

Tired of investing in off-the-shelf software that doesn't quite fit your business? Invertase can build solutions tailored to your exact needs. Tell us what you're looking for here and we'll be in touch.

Dart has earned its reputation for being gentle on beginners and yet robust enough for seasoned coders. Its elegance lies in its familiar simplicity. If you have ever programmed in any C-like language, you’re already familiar with Dart. Beginners won’t be overwhelmed with the trials of memory management, intricate data structures, or convoluted syntax.

But as any student knows, embracing diversity only deepens understanding. While complementary languages like Swift or Kotlin offer their perspectives, why not venture off the beaten track? Why not something enigmatic, like Lisp?

Lisp carries an aura of impracticality among many developers. It’s unconventional, a realm where parentheses are the king; it forces a shift in how you envision code structuring. And yet, this so-called impracticality can be a hidden treasure, offering a fresh lens that could level up your Dart skills.

I believe in learning by doing. Inspired by Peter Norvig’s article “(How to Write a (Lisp) Interpreter (in Python)),” where he introduces us to the magic of creating a Scheme (a Lisp dialect) interpreter in Python, I embarked on a similar journey. Only this time, I traded Python for Dart. What started as a weekend project became an adventure I’m eager to share.

Scheme express course

The basic unit of Scheme code is the expression, and every expression is enclosed in parentheses. Function calls are written by putting the function name first, followed by the arguments.

Let’s take a look at an example:

(define x 42)

In this example, the define function is used to bind the value 42 to the variable x.

If you want to perform operations, the operator (which is just a function) comes first, followed by the operands. For example, that’s how you can sum numbers:

(+ 2 3 4 5)

Notice that Scheme allows you to put as many parameters as needed.

For conditional statements, the syntax is following:

(if (> x 5)
    (print "x is greater than 5")
    (print "x is not greater than 5"))

In this if expression, the condition comes first (> x 5), followed by the expression to be evaluated if the condition is true and then the expression to be evaluated if the condition is false.

That is enough knowledge to build a Scheme interpreter.

What will we build?

Let’s briefly discuss what an interpreter is and how it differs from a compiler. Both tools translate source code into machine code, but they do so in different ways. An interpreter executes the code line by line, executing each instruction as it goes. A compiler, on the other hand, translates the entire program into machine code before executing it. For our project, we’re creating an interpreter for the Scheme language.

We will split the process into four parts:

  1. Tokenizer & Parser implementations
  2. Environment definition
  3. “eval” function implementation
  4. REPL

Tokenizer & Parser

The first step in creating our interpreter is tokenization. The tokenizer takes the source code and breaks it down into ‘tokens’, the basic units of a program. These tokens can be keywords, identifiers, literals, or operators.

Consider the following Scheme code:

(define x 42)

The expression can be broken down into the following tokens:

  1. ( – the beginning of the expression
  2. define – a keyword in Scheme used for defining variables
  3. x – an identifier, which is the name of the variable
  4. 42 – the value being assigned to the variable.
  5. ) – the end of the expression

The tokenizer would parse this into a list of tokens:

['(', 'define', 'x', '42', ')']

This tokenization process is the first step in interpreting or compiling the code. Each of these tokens will be further analyzed and evaluated based on their respective roles in the program.

Let’s implement a primitive tokenizer in Dart now:

List<String> tokenize(String program) => program
    .replaceAll('(', ' ( ')
    .replaceAll(')', ' ) ')
    .split(' ')
    .where((token) => token.isNotEmpty)
    .toList();

Our interpreter will leverage Dart’s types and will support numbers, booleans, and strings. Additionally, we will need a custom type for keywords and custom names. We will call it Symbol. Let’s define it:

class Symbol {
  const Symbol(this.name);
  final String name;

  @override
  String toString() => name;

  @override
  bool operator ==(Object other) =>
      identical(this, other) ||
      other is Symbol && runtimeType == other.runtimeType && name == other.name;

  @override
  int get hashCode => name.hashCode;
}

So far, our tokens are strings. Let’s write a function that will convert raw token strings into interpreter atoms, i.e., will convert Scheme types into Dart types:

dynamic _atom(String token) {
  // parse doubles
  final d = double.tryParse(token);
  if (d != null) {
    return d;
  }
  // parse booleans
  if (token == '#t') {
    return true;
  }
  if (token == '#f') {
    return false;
  }
  // parse strings
  if (token.startsWith('"') && token.endsWith('"')) {
    return token.substring(1, token.length - 1);
  }
  return Symbol(token);
}

The tokenizer returns a list of tokens, hence let’s use the new _atom function to parse these tokens:

dynamic parseTokens(List<String> tokens) {
  if (tokens.isEmpty) {
    throw Exception('Unexpected EOF');
  }
  final token = tokens.removeAt(0);
  if (token == '(') {
    final elements = <dynamic>[];
    while (tokens.first != ')') {
      elements.add(parseTokens(tokens));
      if (tokens.isEmpty) {
        throw Exception('Unexpected EOF');
      }
    }
    tokens.removeAt(0); // remove ')'
    return elements;
  } else if (token == ')') {
    throw Exception('Unexpected ")"');
  } else {
    return _atom(token);
  }
}

Now we may complete the parser implementation:

dynamic parse(String program) => parseTokens(tokenize(program));

The first part of our interpreter implementation is done; you may play around with the parser here:

https://zapp.run/gist/ee9dbfcc37981a0cbbb9293cd57e84cf

Environment

The environment holds the association between keywords and their values. It’s where we store predefined and user-defined operators, procedures, and variables.

We may implement it via Map:

final Map<Symbol, dynamic> standardEnv = {
  Symbol('pi'): math.pi,
  Symbol('sin'): math.sin,
  Symbol('cos'): math.cos,
  Symbol('sqrt'): math.sqrt,
  Symbol('random'): (num x) => math.Random().nextDouble() * x,
  Symbol('>'): (a, b) => a > b,
  Symbol('<'): (a, b) => a < b,
  Symbol('>='): (a, b) => a >= b,
  Symbol('<='): (a, b) => a <= b,
  Symbol('='): (a, b) => a == b,
  Symbol('abs'): (num x) => x.abs(),
...
};

See the full standardEnv implementation in the code snippet further.

“eval” function

The eval function is the heart of our interpreter. It takes a token or a list of tokens and an environment and evaluates the tokens based on their meaning in the language. It’s in this function where the magic happens, where the simple tokens and the environment come together to create the rich functionality of the Scheme.

Expression evaluation will be done in the following process:

  1. Symbol is our internal special type. The operation or a constant associated with it is stored in the environment. This means we have to return it from the eval function.
  2. If the expression is one of the primitive types – return it as is.
  3. If the expression is a List (and in Lisp, pretty much everything is a list) – handle it appropriately.
  4. Alternatively – crash, as something unexpected happens.
dynamic eval(dynamic x, Map<Symbol, dynamic> env) {
  if (x is Symbol) {
    return env[x];
  }
  if (x is num || x is bool || x is String) {
    return x;
  }
  if (x is List) {
    return switch (x[0].toString()) {
      'if' => _handleIf(x, env),
      'define' => _handleDefine(x, env),
      'quote' => x[1],
      'lambda' => _handleLambda(x, env),
      'map' => _handleMap(x, env),
      _ => _handleProcedure(x, env),
    };
  }
  throw Exception('Unknown expression type: $x');
}

Let’s focus on missing handlers now.

The if the handler should be intuitively clear. The first element in the list is the predicate, the second – is the expression that must be evaluated if the predicate returns true, and the third expression returns the result of evaluation if the predicate is false.

dynamic _handleIf(List x, Map<Symbol, dynamic> env) {
  final test = x[1];
  final conseq = x[2];
  final alt = x[3];
  final exp = eval(test, env) ? conseq : alt;
  return eval(exp, env);
}

The define handler defines new variables in the environment. The first element in the list is the name of the new symbol, and the second – is the expression associated with this symbol.

dynamic _handleDefine(List x, Map<Symbol, dynamic> env) {
  final symbol = x[1].toString();
  final exp = x[2];
  env[Symbol(symbol)] = eval(exp, env);
  return null;
}

The lambda handler creates an anonymous function that captures all variables from the first expression parameter in a new local environment and evaluates the second parameter. It sounds tricky, so let’s review an example. Here is a lambda declaration (lambda (r) (* pi (* r r)))). Here r is the local variable from the local environment, pi – from the standard one, and (* pi (* r r)) is the anonymous function body. Why might we need lambda? To declare functions in the environment, like (define circle-area (lambda (r) (* pi (* r r)))).

The handler implementation is the following:

dynamic _handleLambda(List x, Map<Symbol, dynamic> env) {
  return (List arguments) {
    final localScope = <Symbol, dynamic>{
      ...Map.fromIterables(x[1].cast<Symbol>(), arguments),
      ...env,
    };
    return eval(x[2], localScope);
  };
}

The map handler is a simple one. It evaluates its second parameter, which will return a list, and then applies the function from the first parameter to each element, creating a new list. It’s similar to List.map function from Dart.

dynamic _handleMap(List x, Map<Symbol, dynamic> env) {
  final args = eval(x[2], env);
  return args.map((p) => eval([x[1], p], env)).toList();
}

The last and most important handler is the handler that runs procedures. The tricky part here is the lambdas and functions that accept a list of parameters. Unfortunately, Dart doesn’t support functions with the variadic list of parameters. In Dart, a + b must always have only two parameters, and when you need to add c you add one more + operator – a + b + c, but in Lisp, you may sum as many parameters as you need – (+ a b c). Because of that, I’ve made a hack – I wrap arguments in a list for some special cases.

dynamic _handleProcedure(List x, Map<Symbol, dynamic> env) {
  var proc = eval(x[0], env);
  var args = x.sublist(1).map((arg) => eval(arg, env)).toList();

  return Function.apply(
    proc,
    // if function is either an exceptional or a custom lambda - wrap args
    _multipleParamsFuncs.containsKey(x[0]) ||
            (proc is Function && !_strictParamsFuncs.containsKey(x[0]))
        ? [args]
        : args,
  );
}

That required some changes to the environment:

final Map<Symbol, dynamic> standardEnv = {
  ..._strictParamsFuncs,
  ..._multipleParamsFuncs
};
final Map<Symbol, dynamic> _strictParamsFuncs = {
  Symbol('pi'): math.pi,
...
};
final Map<Symbol, dynamic> _multipleParamsFuncs = {
  Symbol('begin'): (List x) => x.last,
  Symbol('list'): (List<dynamic> x) => List.from(x),
  Symbol('+'): (List x) => x.reduce((a, b) => a + b),
  Symbol('-'): (List x) => x.reduce((a, b) => a - b),
  Symbol('*'): (List x) => x.reduce((a, b) => a * b),
  Symbol('/'): (List x) => x.reduce((a, b) => a / b),
  Symbol('max'): (List x) => x.cast<num>().reduce((a, b) => math.max(a, b)),
  Symbol('min'): (List x) => x.cast<num>().reduce((a, b) => math.min(a, b)),
};

Here is the full code that we have written so far:

https://zapp.run/gist/dbaba212c81584dcce7e95e09e4e2e52

REPL

The last piece of our interpreter puzzle is a REPL or Read-Eval-Print-Loop. This is the user interface of our interpreter, where users can enter their code, see it evaluated, and get the results printed out.

A REPL is the interactive programming environment for our interpreter. As its name suggests, it performs the following steps in a loop:

  1. Read: The REPL reads input from the user. This is typically a line of Scheme code that the user has typed in.
  2. Eval: Next, the REPL evaluates the input. It uses the tokenizer to break the input into tokens and then uses the eval function to interpret these tokens in the current environment context.
  3. Print: After evaluating the input, the REPL prints out the result. This gives the user immediate feedback on the result of their code.
  4. Loop: The REPL returns to the beginning and waits for the following line of input from the user.

A REPL is a powerful tool for learning and experimentation. With it, users can write and test snippets of Scheme code, exploring the behaviour of the language and receiving immediate feedback.

import 'dart:convert';
import 'dart:io';

void main(List<String> arguments) {
  // Listen for Ctrl+C
  ProcessSignal.sigint.watch().listen((_) => exit(0));

  stdin.transform(utf8.decoder).listen(
    (program) {
      try {
        print(eval(parse(program)));
      } catch (e) {
        print(e.toString());
      }
    },
    onDone: () => exit(0),
    onError: (_) => exit(1),
  );
}

Put all code listings into a single file (let’s call it scheme_repl.dart). Now you can run it:

dart scheme_repl.dart

The interpreter will wait for your command. Try to write some simple Scheme code:

(define x 42)
null
(print x)
42.0
null
(+ 2 3)
5.0

Yay! It works! How about something more sophisticated?

(if (= 1 1) 1 0)
1.0
(begin (define fact (lambda (n) (if (<= n 1) 1 (* n (fact (- n 1)))))) (fact 10))
3628800.0
(begin (define fib (lambda (n) (if (< n 2) 1 (+ (fib (- n 1)) (fib (- n 2)))))) (define range (lambda (a b) (if (= a b) (quote ()) (cons a (range (+ a 1) b))))) (map fib (range 0 10)))
[1.0, 1.0, 2.0, 3.0, 5.0, 8.0, 13.0, 21.0, 34.0, 55.0]

We’ve completed the implementation of a toy Scheme interpreter! You may find the latest version of the code here.

Conclusion

While our creation might not pass for a polished or practical interpreter implementation, it was never meant to. Our mission was to delve into learning, seek comprehension, and revel in the delight of discovery. And on those fronts, I dare say we’ve triumphed. For those eager to harness the power of Lisp in Dart’s comforting confines, ClojureDart offers a sophisticated solution.

Sharing this project with you has been a great joy, and I hope it has been as enlightening and enjoyable for you as it was for me. Remember, programming isn’t just about reaching a destination – it’s about the thrill of new challenges, the knowledge gained, and the fun had along the way.

Happy coding!

From Author

Russia started an unfair and cruel war against my country. If you found this article interesting or useful, please, donate to Ukraine’s Armed Forces. Thanks in advance to everyone who participated.

Glory to Ukraine! 🇺🇦

Stay tuned for more updates and exciting news that we will share in the future. Follow us on Invertase TwitterLinkedin, and Youtube, and subscribe to our monthly newsletter to stay up-to-date. You may also join our Discord to have an instant conversation with us.

Oleksandr Leushchenko

Senior Staff Engineer

Categories

Tags