Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles / Languages / C#

Generate AST for the DLR

4.75/5 (13 votes)
1 Aug 2008CPOL7 min read 1   365  
Learn how to generate your own code for the DLR using an Abstract Syntax Tree.

Introduction

The Dynamic Language Runtime (DLR) is a layer on top of the .NET Framework 3.5 aiming at helping to build dynamic languages in .NET. The DLR is the heart of a lot of dynamic languages delivered by Microsoft: IronPython, IronRuby, Managed JScript, VBX, … The DLR is provided in the form of two DLLs: Microsoft.Scripting.dll and Microsoft.Scripting.Core.dll.

astdlrtest

One of the major features of the DLR is that it allows you to easily build a set of statements and generate it as IL code. So, in this article, I will show you how, using an Abstract Syntax Tree, you could generate your own code with the DLR.

Abstract Syntax Tree

Abstract Syntax Tree, shortly AST, is a way to hierarchically represent a program in memory. Most of the compilers or interpreters use AST to temporarily store a program read from a source file. In an AST, each node is an element: statement, operator, or value. Here is a simple example of an AST that I'll use later:

astdlrtest2.png

Like its name suggests, the namespace Microsoft.Scripting.Ast in the DLR contains AST nodes for each element usable in a language. This article will explore the major existing AST nodes in the DLR.

Create a program

The first thing to do to build an AST for the DLR is to create a root LambdaBuilder object. This LambdaBuilder object will contain all the statements for your program, and will be translated to a LambdaExpression:

C#
// Create a lambda expression builder
LambdaBuilder program = AstUtils.Lambda(typeof(object), "ASTDLRTest");

The Lambda method is a factory method. You must give it a data type and a name as parameters, but they are not used here.

Assign statement

Let's now see how I could create some very simple statements to assign values to variables, like in the following sample:

C#
n = 4;
r = n;

Statements in the DLR are represented by classes which derive from the Expression class. You can handle a set of statements using a generic List.

C#
// Create a set of statements
List<Expression> statements = new List<Expression>()

To handle variables with the DLR, you create it using the Lambda Expression method CreateLocalVariable. Two parameters are needed here: the variable name and the data type.

C#
// Create two variables
Expression n = program.CreateLocalVariable("n", typeof(int));
Expression r = program.CreateLocalVariable("r", typeof(int));

So, I can now assign values to these variables:

C#
// n = 4
statements.Add(
    Expression.Assign(
        n,
        Expression.Constant(4)
    )
);

// r = n
statements.Add(
    Expression.Assign(
        r,
        n
    )
);

Two other factory methods are used here. The Constant method is used to create a constant expression from a literal value. The Assign method is used to assign a value to a variable.

Expressions

Expressions are created in a very similar way, using a factory method for each sort of operation. Let's see how I'm going to build the following expression:

C#
r = r * (n - 1);

Here is the DLR code to call:

C#
// r = r * (n - 1)
statements.Add(
    Expression.Assign(
        r,
        Expression.Multiply(
            r,
            Expression.Subtract(
                n,
                Expression.Constant(1)
            )
        )
    )
);

Note that there is no need to use parenthesis statements with AST because there is no ambiguity regarding how to evaluate this expression.

Call a .NET method

It could be great now to display the content of our variable r. Something like this:

C#
Console.WriteLine(r);

The DLR allows you to easily interoperate with .NET and the CLR. Following is the signature for the Expression.Call factory used to call a CLR method:

C#
MethodCallExpression Expression.Call(MethodInfo method, 
                        params Expression[] arguments);

If you have already played with the Reflection API in .NET, you probably know that MethodInfo is the structure representing a method description. So, to retrieve the Console.WriteLine description, you should use the .NET Reflection API like:

C#
MethodInfo consoleWriteLine = 
  typeof(Console).GetMethod("WriteLine", new Type[] { typeof(int) });

Then, you can use our previous Expression.Call factory to write the content of our variable r:

C#
statements.Add(
    Expression.Call(
        consoleWriteLine,
        r
    )
);

Loops

Let's see now the means to do loop statements, and more precisely, a while. Here are the statements I like to generate:

C#
while (n>1)  { r = r * (n-1); n = n - 1; }

To write these statements using the DLR, I just need to call a new method factory, While. Here is how:

C#
statements.Add(
    AstUtils.While(
        Expression.GreaterThan(
            n,
            Expression.Constant(1)
        ),
        Expression.Block(
            Expression.Assign(r, Expression.Multiply(r, 
                                 Expression.Subtract(n, Expression.Constant(1)))),
            Expression.Assign(n, Expression.Subtract(n, AstUtils.Constant(1)))
        ),
        AstUtils.Empty(span)
    )
);

Another node of interest here is the Block node. This node provides a way to compose several statements in one go. Here, it allows me to do the inner block for the while statement.

Custom function

Hey, wait a minute, I know the name for the previous set of statements: it's factorial! With these statements, we've just computed the factorial value for the number n. It could be great now to write a custom function for it.

This is a two step process:

  • Create the function: create the AST for the function code.
  • Call the function: create the AST to call the function with the required arguments.

Create the function

In the DLR, a function is a Lambda Expression, so to create my new function, I first need a new Lambda Expression:

C#
// Create lambda expression
LambdaBuilder function = AstUtils.Lambda(typeof(int), "fact");

// Declare parameter
Expression n = function.CreateParameter("n", typeof(int));

This time, the Lambda Expression is created with the function return type (integer) and the function name ("fact"). Plus, I need to declare all the parameters for the function. Here, there is just one: "n" is the value to use to compute the factorial.

The source code for the fact function is built on a very similar way:

C#
// Create statements
List<Expression> statements = new List<Expression>();

// r = n
Expression r = function.CreateLocalVariable("r", typeof(int));
statements.Add(
    Expression.Assign(
        r,
        n
    )
);

// while(n>1) { r = r * (n-1); n = n - 1; }
statements.Add(
    AstUtils.While(
        Expression.GreaterThan(
            n,
            Expression.Constant(1)
        ),
        Expression.Block(
            Expression.Assign(r, Expression.Multiply(r, 
                    Expression.Subtract(n, Expression.Constant(1)))),
            Expression.Assign(n, Expression.Subtract(n, AstUtils.Constant(1)))
        ),
        AstUtils.Empty(span)
    )
);

// return r
statements.Add(
    Expression.Return(r)
);

The only new thing is that I now use a return statement to return the computed value to the calling function.

Then, I need to assign statements to the Lambda Expression and translate it to an Expression.

C#
// Add statements
function.Body = AstUtils.Block(span, statements);

// Factorial as expression
Expression factorialExpression = function.MakeLambda();

Finally, I need to declare my fact function in the program. With the DLR, a function is just a variable with a Lambda Expression as a value. So, here is how to declare the fact function:

C#
// fact = function(n) { ... }
Expression fact = program.CreateLocalVariable("fact", typeof(object));
statements.Add(
    Expression.Assign(
        fact,
        factorialExpression()
    )
);

That's all. A new fact function now exists in my program.

Call the function

Calling a custom function is slightly more complex than calling a CLR method. I must add some DLR stuff to handle a dynamic call. Here is how:

C#
// fact(5)
Expression.Action.Call(
    this.Binder,
    typeof(int),
    Expression.CodeContext(),
    fact,
    Expression.Constant(5)
);

Note that I use the Expression.Action.Call factory with two main arguments (at the end): the fact variable, and the value to pass to the function (5 here). I must also add the Binder for the language (we'll see it shortly), the return type of the call, and the current code context.

Because calling fact without retrieving its result is not very interesting, I finally embed this call into a print call:

C#
statements.Add(
    Expression.Call(
        consoleWriteLine,
        Expression.Action.Call(
            this.Binder,
            typeof(int),
            Expression.CodeContext(),
            fact,
            Expression.Constant(5)
        )
    )
);

Hosting AST

To run the AST in the previous samples, I need a new DLR language and a hosting process. Let's summarize how the full process works, to explain why:

  1. A host .NET application needs to run a script, it calls the DLR. This host could be Silverlight, ASP.NET, or any other process. The script could be written in any dynamic language based on the DLR.
  2. The DLR calls the right engine for the language used in the script.
  3. This engine parses the script, translates it to an AST, and passes it back to the DLR.
  4. Finally, the DLR translates the resulting AST into IL code and runs it.

So, the source code included with this article defines a new language called ASTDLRTest. To do that, I've implemented two classes: a language context and a binder. See the class diagram below:

astdlrtest3.png

I've just provided a very minimal implementation here. The ParseSourceCode method (the one called by the DLR) always returns the previous AST, whatever the input be.

Below is the source code for my host:

C#
static void Main(string[] args)
{
    // Create runtime
    ScriptRuntime runtime = ScriptRuntime.Create();

    // Load Engine
    ScriptEngine engine = 
      runtime.GetEngine(typeof(ADT.DLR.ADTLanguageContext));

    // Execute command
    ScriptSource src = engine.CreateScriptSourceFromString("<any />");
    src.Execute();

    // Shutdown engine
    engine.Shutdown();
}

Once again, it can't be easier. It creates the DLR's ScriptRuntime, then loads the engine for my new language, and finally, executes a script (any string could be used because my language always return the same AST).

With these few classes, you can now play with the AST's DLR, like in all the previous samples.

Conclusion

This article explains how, using AST, you could generate your own code for the DLR. However, it's only one part of the process of creating your own language. The other part is to create your own parser to generate the AST corresponding to an input script.

If you want to learn more on this, you could have a look at MyJScript, a JavaScript like DLR language that I've created from scratch as a tutorial on the DLR. Another interesting sample is ToyScript, a BASIC like language, downloadable with IronPython.

Finally, note that this article is based on DLR beta 3, some changes could appear in the final version.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)