Sunday, March 15, 2009

Writing a Compiler with F# Part 2

Since the ideas I want to mess with are more related to type systems than, say, instruction selection or optimizing passes, I'm going to completely neglect the backend of the compiler for awhile. Initially I'm not going to compile the language at all. Instead I'll just build a simple interpreter to evaluate the type-checked parse tree.

The easiest way to play with this is to make a basic REPL (Read Evaluate Print Loop). Basically a prompt that you can type language snippets into and have them run and the results printed. For F#, the REPL is called fsi.exe. Other languages have this feature as well (in Ruby it's irb.exe).

To get started, I'm going to forget about all the lexing and parsing stuff and get my feet wet with F# by building the framework for a simple REPL.

First, I'll need to print a prompt and get some input. Here's an F# function to handle that:

let getInput (prompt:string) =
(Console.Write(prompt); Console.ReadLine())

Pretty basic stuff. I had to specify the type for "prompt" since Console.Write is overloaded.

Next, I want a basic loop to get input and evaluate it. Functional programming and F# in particular work naturally with sequences as a first class element. To that end, I'm going to model the user input as a sequence of lines. The F# Seq type has a generate method for this purpose. It takes 3 functions: an initializer that returns any data you might need, a generator that given the initial data produces either another element or None to signify the sequence has ended, and a cleanup function. Initially, I don't need any supporting data so both the initialize and cleanup functions can do nothing. Here's what it looks like:

let input = Seq.generate
(fun () -> ()) // init
(fun () ->
match getInput(">") with
| "#quit" -> None
| inp -> Some(inp))
(fun () -> ()) // cleanup

The middle function uses getInput that was defined above and asks the user for input. If they type "#quit" then the sequence ends, otherwise the input is returned.

Now we'll take this sequence and do something with it. Ultimately we'll want to evaluate the input and then keep prompting. Since the statement run might result in new variables being declared or existing ones modified, we'll need some way to remember what happened. The state of our world is referred to as an environment, so we'll write a dummy eval function that takes an environment, evaluates input, and returns the new environment to pass along to the next statement. For now let's use this:

let eval (env:int) (x:string) =
(Console.WriteLine("Evaluating: {0}", x);env)

We ignore the environment, just print out the input, and return the environment unchanged.

The last step is to apply this eval function to our sequence of user input, passing the modified environment forward to the next element. This is a classic case of a fold. We start with an initial environment, call eval on the first item in the sequence, and then pass the resulting environment in when calling eval on the second item. Then the environment returned by eval'ing the second item gets passed into the third, etc. Here's what this looks like:

Seq.fold eval 0 input |> ignore

That's it. We don't care about the final value of the environment so we just pipe it to ignore. With only a few lines of F# we have a basic REPL going and we're ready to start with the real language work next time.

No comments: