# Write an interpreter: variables

This post is the second in a series. Last time, we started coding up a toy interpreter for a language of arithmetic expressions. This time, we’ll extend the language to include variables and update our interpreter to handle that extended language.

## A slightly less tiny grammar of expressions

Our interpreter only knows how to deal with expressions: things of type `Expr`. To be able to interpret a bigger language, we’ll need to expand the interpreter’s notion of what an `Expr` is.

Let’s add a new variant to our `Expr` type:

``````data Expr = Number Int
| Var String
| Plus Expr Expr
| Minus Expr Expr
| Times Expr Expr
``````

This is what we had before, but with one new variant added: `Var`. `Var` is a value constructor for variables, just as `Number` is a value constructor for integers. A convenient way to represent a variable is as a string, such as `"x"` or `"totalMonkeys"`, so `Var` will take a `String` argument.

With `Var` added to the definition of `Expr`, we can write down some `Expr`s with variables in them:

``````*Interp> :t Var "x"
Var "x" :: Expr
*Interp> :t Plus (Var "x") (Number 5)
Plus (Var "x") (Number 5) :: Expr
*Interp> :t Plus (Var "x") (Var "y")
Plus (Var "x") (Var "y") :: Expr
``````

GHCi reports that these expressions are indeed `Expr`s. Now we just have to teach our interpreter how to deal with them.

## Environments

When given an expression, the job of an interpreter is to return that expression’s value. This is easy enough for the interpreter to accomplish when we’re working with expressions like `9` or `(3 + 4)` or `((203 + (6 * (34 - 1))) + 7000)`. But what is the value of an expression containing a variable, like `(x + 5)`?

To interpret expressions that contain variables, merely having the expressions isn’t enough. But if we could get our hands on some extra information that would tell us that, say, the value of `x` is `42`, then we would know how to interpret `(x + 5)`, or any other expression containing `x`. This extra information is called an environment.

A simple way to represent an environment might be a list of pairs. For instance, `[("x", 42), ("y", 5)]` could represent the environment in which `x`’s value is `42` and `y`’s value is `5`. Instead of using a list of pairs, we’ll use Haskell’s `Map` data structure from the `Data.Map` library, which comes with many convenient operations – but it doesn’t really matter what data structure we use, as long as it lets us associate variables with values.

Using `Map`, let’s define our own `Env` type for environments. Since we want to map variables to values, or in our case, `String`s to `Int`s, we’ll instantiate the `Map` type with `String` and `Int`.

``````import qualified Data.Map as Map

type Env = Map.Map String Int
``````

With our `Env` type defined, we can define a function `lookupEnv` that will let us look things up in an environment. `lookupEnv` is just a thin wrapper around the `lookup` function that `Data.Map` already provides.

``````lookupEnv :: Env -> String -> Int
lookupEnv env x = case Map.lookup x env of
Just n -> n
Nothing -> error "Unbound variable!"
``````

The `lookupEnv` operation takes two arguments: `env`, the environment, and `x`, the variable we’re trying to look up. If the lookup succeeds, `lookupEnv` returns the value it found; otherwise, it raises an exception.

We haven’t yet written anything that will let us add new bindings to an environment. For now, we can roll our own environments by calling the convenient `fromList`, which takes an argument like `[("x", 42), ("y", 5)]` and produces the corresponding `Map`.

``````*Interp> lookupEnv (Map.fromList [("x", 42), ("y", 5)]) "x"
42
*Interp> lookupEnv (Map.fromList [("x", 42), ("y", 5)]) "y"
5
*Interp> lookupEnv (Map.fromList [("x", 42), ("y", 5)]) "z"
*** Exception: Unbound variable!
``````

## The `interp` function

Now that we have environments and a way to look things up in them, we’re ready to add support for variables to `interp`.

First, we’ll tweak the type of `interp`. Before, it simply took an `Expr` and returned an `Int`. Now that we have to be able to interpret variables, we’ll need to have an environment handy to look up their values in. We can pass one in as a second argument to `interp`:

``````interp :: Expr -> Env -> Int
``````

Next, in the definition of `interp`, we’ll add a new case to handle `Var` expressions. When we encounter a `Var`, we have to determine its value. Fortunately, this is easy: we just call the `lookupEnv` function we defined, passing in the environment and the variable as its arguments.

None of the other cases of `interp` will need to make use of the environment at all. But since `interp` now takes an environment argument, each of the recursive calls in the `Plus`, `Minus`, and `Times` cases will have to pass the environment along.

``````interp e env = case e of
Var x -> lookupEnv env x
Number n -> n
Plus e1 e2 -> (interp e1 env) + (interp e2 env)
Minus e1 e2 -> (interp e1 env) - (interp e2 env)
Times e1 e2 -> (interp e1 env) * (interp e2 env)
``````

We now have what’s known as an environment-passing interpreter.

We’re ready to interpret some expressions with variables. We’ll have to remember to give `interp` two arguments: first, the `Expr` to be interpreted, and second, the environment to interpret them in. How will we get an environment to pass to `interp`? We can do something like `Map.fromList [("x", 42), ("y", 5)]`, or, if we just want an empty environment, we can call `Map.empty`. Let’s define a quick shorthand for empty environments:

``````emptyEnv :: Env
emptyEnv = Map.empty
``````

Now we’re ready to call `interp`. What happens if we try to interpret an `Expr` containing variables, but we only give it `emptyEnv`?

``````*Interp> interp (Plus (Var "x") (Number 5)) emptyEnv
*** Exception: Unbound variable!
``````

The interpreter correctly reports that `emptyEnv` isn’t enough information to interpret `Var "x"`. Nevertheless, `emptyEnv` is handy for when we just want to interpret expressions without variables.

``````*Interp> interp (Times (Number 2) (Number 0)) emptyEnv
0
``````

But, when we do have variables to interpret, we need to give `interp` an environment that binds them to values.

``````interp (Plus (Var "x") (Number 5)) (Map.fromList [("x", 42)])
47
``````

Hooray! Our language now supports variables. Here’s the complete interpreter:

``````import qualified Data.Map as Map

data Expr = Number Int
| Var String
| Plus Expr Expr
| Minus Expr Expr
| Times Expr Expr

type Env = Map.Map String Int

emptyEnv :: Env
emptyEnv = Map.empty

lookupEnv :: Env -> String -> Int
lookupEnv env x = case Map.lookup x env of
Just n -> n
Nothing -> error "Unbound variable!"

interp :: Expr -> Env -> Int
interp e env = case e of
Var x -> lookupEnv env x
Number n -> n
Plus e1 e2 -> (interp e1 env) + (interp e2 env)
Minus e1 e2 -> (interp e1 env) - (interp e2 env)
Times e1 e2 -> (interp e1 env) * (interp e2 env)

``````

## Parsing

Since we can now interpret expressions containing variables, it would be convenient to be able to lex and parse them as well, so that we can write, say, `"(10 + (x + y))"` instead of `Plus (Number 10) (Plus (Var "x") (Var "y"))`.

Last time, we generated a parser for our language from a Happy grammar file. Adding support for variables requires only a couple of tweaks to that file. Happily, we can deal with lexing and parsing variables in a way analogous to how we’re already dealing with integers. The resulting Happy grammar file is a little longer, but no more complicated.

``````*Main> lexer "(10 + (x + y))"
[TokenOB,TokenInt 10,TokenPlus,TokenOB,TokenVar "x",TokenPlus,TokenVar "y",TokenCB,TokenCB]
*Main> (parser . lexer) "(10 + (x + y))"
Plus (Number 10) (Plus (Var "x") (Var "y"))
*Main> import qualified Data.Map as Map
*Main Map> (interp . parser . lexer) "(10 + (x + y))" (Map.fromList [("x", 42), ("y", 5)])
57
``````

## Next time: functions

Now that we have support for variables in our language, we’re on our way to being able to write functions that take arguments. In the next post in this series, we’ll extend our interpreter to be able to handle functions and function calls.

All of the code from this post, plus a little additional scaffolding, is on GitHub. Here’s a diff of everything that changed from last time.

Categories:

Updated: