Making .NET Applications Scriptable with Aphid, an Embeddable Scripting Language

22 Apr 2018
This article details how to make .NET applications scriptable with Aphid, an embeddable scripting language.

Important Updates

Update 4/22/2018

Aphid has undergone extensive development over the last year, with over 500 commits since the last update. Far removed from the embedded scripting language it was planned to be, Aphid is now a full-featured .NET programming language with fully inferred, seamless CLR interop that includes support for generics and delegates. Explicit generic type references are supported as well, allowing for instantiation of types such as List<T> and Dictionary<TKey, TValue> without resorting to reflection. Also worth noting are the many development tool improvements, including the overhauled REPL with syntax highlighting and detailed autocomplete, Visual Studio Code extensions for syntax highlighting and debugging, and Visual Studio syntax highlighting and error checking that is now compatible with VS2012 through VS2017.

For a more detailed list of the changes introduced by this major revision, see the History section at the end of this article.

Update 4/26/2016

This project has been moved to GitHub: https://github.com/John-Leitch/Aphid/.

The releases attached to this page may not be the most recent. To get the latest version of Aphid, visit the GitHub releases page.

Note: for technical reasons, the online editor will not see updates for some time. The current version is online for legacy reasons and is not recommended.

For evaluation purposes, an online editor has been created. It can be found here: http://autosectools.com/Try-Aphid-Online

Table Of Contents

  1. Introduction
  2. Why Another Scripting Language?
  3. Syntax
    1. Basics
    2. Program Structure
    3. Control Structures
  4. Type System And Types 
  5. Hello World
  6. Lists
  7. Functions
    1. Higher-order Functions
    2. Partial Function Application
    3. Pipelining
    4. Extension Methods
  8. Macros
  9. Handling Parser Exceptions
  10. .NET Interoperability
    1. Accessing Aphid Variables From .NET
    2. Calling .NET Functions From Aphid
    3. Calling Aphid Functions From .NET
    4. Object Interoperability
    5. Seamless .NET Interoperability
  11. Internals
    1. Mantispid: A Powerful Lexer And Parser Generator
    2. Lexical Analysis
  12. Resources
  13. History

Introduction

Aphid is an embeddable, cross-platform, multi-paradigm, and highly interoperable .NET scripting language. It is a C-style language that draws heavily from JavaScript, but is also inspired by C#, F#, Python, Perl, and others. The Aphid interpreter is implemented entirely in C# and Aphid, with the goal of completely bootstrapping it. This article is intended to be an introduction to Aphid, and as such, only covers some of the features available.

This project is currently in beta, so as it evolves expect this article to change and grow with it. For the most recent version of Aphid, visit the GitHub releases page.

Why Another Scripting Language?

Currently, few easily embeddable scripting languages exist for the .NET platform. Among those available, many have several dependencies, necessitating the inclusion of various assemblies. Still others are lacking in interoperability, requiring inordinate amounts of wire-up code. Aphid seeks to solve these problems by providing an easily embeddable, highly interoperable scripting language contained within a single DLL.

Further, most language implementations are black boxes. Aphid was designed and constructed to be a white box at every layer: from the lexical analyzer to the interpreter, all of the internals are exposed through a clean object model. This allows Aphid to serve as a framework for the rapid development of domain-specific languages.

Syntax

As a C-style language that draws heavily from JavaScript, Aphid's syntax should be familiar to most programmers. 

Basics

Statements

In most cases, a statement is an expression followed by a semicolon (;). However, for some statements such as if/else and switch, the semicolon (;) is omitted.

// A statement terminated by a semicolon.
foo = true;

// An if statement without a semicolon.
if (foo) {
    print('bar');
}

Unlike many other C-style languages, Aphid is not tolerant of superfluous semicolons (;). 

x = 1;; // This semicolon will cause a syntax error.

if (foo) {
    print('bar');
}; // As will this one.

Numbers

Number literals can be written in decimal, hexadecimal, or binary form.

x = 1;

x = 3.14159265359;

x = 0xDEADBEEF;

x = 0b01100000;

Booleans

Boolean values can be specified using the true or false keywords.

x = true;

x = false;

Strings

String literals can be written using either single quotes or double quotes. The backslash character (\) is used in escape sequences.

x = "Hello world";
    
x = 'foobar\r\n';

Lists

Lists are comprised of comma (,) delimited elements enclosed in square brackets ([ ... ]). Trailing commas are allowed.

x = [ 5, 2, 3, 0 ];

List elements can be accessed by indexing via square brackets ([ ... ]).

y = x[1];

Functions

Functions can be called using typical C-style syntax.

func('test');

They can also be called using the pipeline operator (|>).

'test' |> func;

Functions are declared using the function operator (@). Parameters are enclosed in parentheses (( ... )) and comma (,) delimited.

add = @(x, y) {
    ret x + y;
};

If the body of a function is a single return statement, it can be written more concisely as a lambda expression.

// This is semantically identical to the previous example.
add = @(x, y) x + y;

The parentheses can be omitted if no parameters are specified.

foo = @{
    /* do something */
};

The function operator (@) is also used to perform partial function application.

add4 = @add(4);

Similarly, the function operator (@) can also be used on binary operators to perform partial operator application.

// This:
add4 = @+ 4;
// Is similar to this:
add4 = @(x) x + 4;

When used in a function expression or partial application, the function operator (@) can be used to create an implicit pipeline.

// This:
4 |> @add(2) |> print;
// Is the same as this:
4 @add(2) print;

The function composition operator (@>) can be used to chain functions together.

halfSquare = square @> divideBy2;
10 |> halfSquare |> print;
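
For readers more familiar with C#, the composition above can be approximated with delegates. This is only a sketch of the idea, not how Aphid implements the operator: Compose is a hypothetical helper, the bodies of square and divideBy2 are guesses based on their names, and it assumes that the function on the left of @> runs first, as the name halfSquare suggests.

```csharp
using System;

static class CompositionSketch
{
    // Hypothetical helper: returns a function that applies first, then second.
    public static Func<TA, TC> Compose<TA, TB, TC>(
        Func<TA, TB> first,
        Func<TB, TC> second)
    {
        return x => second(first(x));
    }

    static void Main()
    {
        // Assumed definitions; the Aphid snippet does not show them.
        Func<decimal, decimal> square = x => x * x;
        Func<decimal, decimal> divideBy2 = x => x / 2;

        var halfSquare = Compose(square, divideBy2);

        Console.WriteLine(halfSquare(10m)); // 10 squared is 100, halved is 50
    }
}
```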

Objects

Objects are comprised of comma (,) delimited name-value pairs enclosed in braces ({ ... }). Names and values are separated by a colon (:). Trailing commas are allowed. Nested objects are supported.

widget = {
    name: 'My Widget',
    location: { x: 10, y: 20 },
    data: [ 0xef, 0xbe, 0xad, 0xde ],
};

When referencing a variable in an object initializer, the right hand side of the name-value pair can be omitted if the names match.

data = [ 0xef, 0xbe, 0xad, 0xde ];

widget = {
    name: 'My Widget',
    location: { x: 10, y: 20 },
    data,
};

The member access operator (.) can be used to get and set members.

widget.location.x = 10;
print(widget.location.x);

Members can be accessed dynamically using curly braces ({ ... }).

key = 'location';
widget.{key}.y++;
print(widget.{key}.y);

Null

Null values can be specified using the null keyword.

x = null;

Program Structure

The program structure of an Aphid script depends on the initial lexical analyzer mode. In standard mode, the lexer assumes the file begins with Aphid code and parses it as such. However, when set to document mode, the lexer assumes that the file begins with raw text, and switches between modes upon encountering gator (<% ... %>) or gator emit (<%= ... %>) tokens. During interpretation, text is written to the console unless otherwise specified.

A typical program tokenized in standard mode might look something like this:

#'std';
x = 10;
x *= 2;
print(x);

While a program tokenized in document mode may look something like the following:

Hello <% 
    #'std';
    print('world');
%>

The gator emit (<%= ... %>) tokens can be used to concisely emit expressions.

Hello <%= 'world' %>

Note the absence of a semicolon; an expression is expected after the opening gator emit token, not a statement.
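
To make the mode switching more concrete, the following C# sketch splits a document into raw text, code, and emit segments. It is a deliberately naive illustration built on a regular expression, not Aphid's lexer; the DocumentModeSketch class and its Split method are hypothetical, and a %> sequence inside a string literal would confuse it.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class DocumentModeSketch
{
    // Hypothetical splitter: labels each segment as text, code, or emit.
    public static List<string> Split(string document)
    {
        var segments = new List<string>();
        var gator = new Regex(@"<%(=?)(.*?)%>", RegexOptions.Singleline);
        var position = 0;

        foreach (Match match in gator.Matches(document))
        {
            // Anything before the opening token is raw text.
            if (match.Index > position)
            {
                segments.Add("text: " + document.Substring(position, match.Index - position));
            }

            // An '=' directly after '<%' marks a gator emit block.
            var kind = match.Groups[1].Value == "=" ? "emit: " : "code: ";
            segments.Add(kind + match.Groups[2].Value.Trim());
            position = match.Index + match.Length;
        }

        // Trailing raw text after the last closing token.
        if (position < document.Length)
        {
            segments.Add("text: " + document.Substring(position));
        }

        return segments;
    }

    static void Main()
    {
        foreach (var segment in Split("Hello <%= 'world' %>!"))
        {
            Console.WriteLine(segment);
        }
    }
}
```

For the sample input, the sketch yields a text segment ("Hello "), an emit segment ('world'), and a final text segment ("!").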

Control Structures

if/else

Conditional statements can be declared using the if and else keywords.

#'Std';
x = 2;

if (x == 1) {
    print('one');
} else if (x == 2) {
    print('two');
} else {
    print('not one or two');
}

The curly braces ({ ... }) can be omitted if a block is a single statement.

#'Std';
x = 2;

if (x == 1)
    print('one');
else if (x == 2)
    print('two');
else
    print('not one or two');

switch

Aphid's switch syntax differs substantially from the typical C-style switch. Rather than using switch and case keywords, Aphid uses colons (:) and curly braces ({ ... }) to be more consistent with the syntax of other expression types. The default keyword is used to specify the case executed when no others match.

#'Std';
x = 20;
switch (x) {
    1: {
        print('One');
    }
    2: {
        print('Two');
    }
    default: print('Default');
}

As with other code blocks, the curly braces ({ ... }) can be omitted if a block is a single statement.

#'Std';
x = 20;
switch (x) {
    1: print('One');
    2: print('Two');
    default: print('Default');
}

for

The for keyword can be used to declare a traditional for loop with an initialization, condition, and afterthought.

#'Std';
l = [ 1, 2, 3 ];

for (x = 0; x < l.count(); x++) {
    print(l[x]);
}

for/in

When used in conjunction with the in keyword, for declares a for-each loop.

#'Std';
l = [ 'a', 'b', 'c' ];

for (x in l) {
    print(x);
}

while

A while loop can be declared using the while keyword.

#'Std';
x = 0;

while (x < 5) {
    x++;
    print(x);
}

Conditional Operator (?:)

The conditional operator (?:) is a ternary operator that can be used to select one of two values based on a condition. The first operand is the condition, the second is the value selected if the condition evaluates to true, and the third is the value selected if the condition is false.

#'Std';

x = 10;
print(x == 10 ? 'x is 10' : 'x is not 10');

try/catch/finally

Exception handling statements can be declared using the try, catch, and finally keywords. An exception handling statement must have a try block along with a catch and/or finally block. An optional exception argument can be specified for the catch block.

#'Std';

// A typical try/catch statement.
try {
    1/0;
} catch(e) {
    print(e);
}

// A try/catch statement with a finally block.
try {
    1/0;
} catch(e) {
    print(e);
} finally {
    print('done');
}

// try/catch without an exception argument
try {
    1/0;
} catch {
    print('error');
}

Type System And Types

Aphid uses a multimode type system; duck typing is used when working with built-in Aphid types, while strong typing is used when interoperating with .NET types. The Aphid type system performs limited type coercion that is closer to C#'s than to that of typical scripting languages like JavaScript and PHP. For example, in an expression like 'foo' + 10, the number literal 10 will be automatically converted into a string. However, in most other cases the types of an expression must match. This is to prevent common programming mistakes, and is particularly helpful with boolean expressions.

There are seven built-in types: string, number, boolean, list, object, function, and null. The table below shows how Aphid's types are mapped to .NET's types.

Aphid Type    .NET Type
string        System.String
number        System.Decimal
boolean       System.Boolean
list          System.Collections.Generic.List<AphidObject>
object        AphidObject : System.Collections.Generic.Dictionary<string, AphidObject>
function      AphidFunction
null          null

Aphid's primitive types, as well as list, are self-explanatory, and any questions can likely be answered by the .NET type's MSDN page. The AphidObject class is of particular interest due to its behavior and extensive use. Every type in Aphid, excepting interop .NET types, is an AphidObject. An object's actual value is contained in the AphidObject.Value property, which is a System.Object, while its members are stored as dictionary items. Apart from values, AphidObject is also used to manage lexical scoping. During resolution, the AphidObject.Parent property, which is also of type AphidObject, is used to search outer scopes.
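
The scope resolution described above can be sketched in plain C#. The Scope class below is a hypothetical stand-in for AphidObject, not Aphid's actual implementation. It assumes only what the text states: members are stored as dictionary items, the value lives in a Value property, and a Parent reference is followed to search outer scopes. The Resolve method is an invented name.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for AphidObject, modeled on the description above.
class Scope : Dictionary<string, Scope>
{
    public object Value { get; set; }
    public Scope Parent { get; set; }

    // Walk outward through Parent links until the name is found.
    public Scope Resolve(string name)
    {
        for (var scope = this; scope != null; scope = scope.Parent)
        {
            Scope found;

            if (scope.TryGetValue(name, out found))
            {
                return found;
            }
        }

        return null; // Not defined in this scope or any enclosing scope.
    }
}

static class ScopeDemo
{
    static void Main()
    {
        var global = new Scope();
        global.Add("x", new Scope { Value = 10m });

        var local = new Scope { Parent = global };
        local.Add("y", new Scope { Value = "foo" });

        Console.WriteLine(local.Resolve("x").Value);   // Found in the outer scope.
        Console.WriteLine(local.Resolve("y").Value);   // Found in the inner scope.
        Console.WriteLine(local.Resolve("z") == null); // Not found anywhere.
    }
}
```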

The AphidFunction class is used to represent Aphid functions. As with other functional languages, functions are values in Aphid. In fact, there is not even a function declaration statement syntax; all functions are expressions.

Hello, World

Getting started with Aphid requires minimal setup. First, add a reference to Components.Aphid.dll. Next, instantiate AphidInterpreter. Finally, invoke the instance method AphidInterpreter.Interpret to execute an Aphid script. Painless, huh? A complete C#/Aphid "Hello world" program is shown below, in listing 1.

Listing 1. A simple C#/Aphid hello world program
using Components.Aphid.Interpreter;

namespace HelloWorld
{
    class Program
    {
        static void Main(string[] args)
        {
            var interpreter = new AphidInterpreter();
            
            interpreter.Interpret(@"
                #'Std';
                print('Hello, world');
            ");
        }
    }
}

The C# portion of the application should be self explanatory. The Aphid program, however, warrants a bit of an explanation. The program consists of two statements.

The first is a load script statement, consisting of the load script operator (#) and the string operand, 'Std'. By default, the Aphid loader first searches the Library subdirectory of the directory in which Components.Aphid.dll resides. The loader automatically appends the ALX extension to the script name passed, so in this instance it looks for <dll>\Library\Std.alx. Assuming everything is in order, it should find and load the file, which is the standard Aphid library, and contains helpful functions for manipulating strings, printing console output, etc.
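
If the load script statement fails, it can help to check the path the loader is expected to probe. The helper below is a hypothetical diagnostic, not part of Aphid. It rebuilds the default location from the description above and assumes that Components.Aphid.dll sits in the application's base directory, which is typical when the reference is copied to the output folder.

```csharp
using System;
using System.IO;

static class LoaderPathSketch
{
    // Hypothetical helper: <dll directory>\Library\<script name>.alx
    public static string ResolveScriptPath(string dllDirectory, string scriptName)
    {
        return Path.Combine(dllDirectory, "Library", scriptName + ".alx");
    }

    static void Main()
    {
        // Assumption: the Aphid DLL was copied next to the executable.
        var dllDirectory = AppDomain.CurrentDomain.BaseDirectory;
        var path = ResolveScriptPath(dllDirectory, "Std");

        Console.WriteLine(path);
        Console.WriteLine(File.Exists(path) ? "Found" : "Missing");
    }
}
```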

The second statement is a call expression which invokes print, a function that is part of the Aphid standard library. This line of code should be rather self explanatory.

When the program is run, the output is as expected (listing 2).

Listing 2. Output from the hello world program
Hello, world
Press any key to continue . . .

Lists

The list type behaves much the same as the underlying System.Collections.Generic.List<T> type. Elements can be accessed by index using square brackets ([ ... ]) or enumerated using a for/in statement. Lists are mutable, so they can be modified after creation. It is important to note that lists are reference types, so copies are not made when they are passed around. Modifications to a list will affect all references.

The example below demonstrates several list operations. First, a list is created using square brackets ([ ... ]). As the initialization elements demonstrate, lists can have mixed types. After creation, the add method is called to add another string to the list. Next, the count and contains methods are used to print information about the list. Finally, a for/in statement is used to enumerate and print the list.

A C#/Aphid program that demonstrates the list type
using Components.Aphid.Interpreter;

namespace ListSample
{
    class Program
    {
        static void Main(string[] args)
        {
            new AphidInterpreter().Interpret(@"
                #'std';
                list = [ 10, 20, 'foo' ];
                list.add('bar');

                printf(
                    'Count: {0}, Contains foo: {1}',
                    list.count(),
                    list.contains('foo'));
                
                for (x in list)
                    print(x);
            ");
        }
    }
}
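
The point about reference semantics is easy to overlook, so here is a short C# sketch of the same behavior. It uses List<object> as a stand-in for the List<AphidObject> that backs Aphid lists; ListReferenceDemo and AddItem are illustrative names only.

```csharp
using System;
using System.Collections.Generic;

static class ListReferenceDemo
{
    // Receives a reference to the caller's list, not a copy of it.
    public static void AddItem(List<object> items, object item)
    {
        items.Add(item);
    }

    static void Main()
    {
        var list = new List<object> { 10m, 20m, "foo" };
        var alias = list; // Copies the reference, not the elements.

        AddItem(alias, "bar");

        Console.WriteLine(list.Count);           // 4: the change is visible through both names
        Console.WriteLine(list.Contains("bar")); // True

        // An explicit copy is needed to get an independent list.
        var copy = new List<object>(list);
        copy.Add("baz");

        Console.WriteLine(list.Count);           // 4: the original is unaffected
    }
}
```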

Functions

Aphid functions are defined by using the function operator (@). Since functions are first-class citizens in Aphid, they can be stored in variables (listing 3).

Listing 3. A C#/Aphid program that defines and invokes an Aphid function
using Components.Aphid.Interpreter;

namespace FunctionSample
{
    class Program
    {
        static void Main(string[] args)
        {
            var interpreter = new AphidInterpreter();

            interpreter.Interpret(@"
                #'Std';

                add = @(x, y) {
                    ret x + y;
                };

                print(add(3, 7));
            ");
        }
    }
}

The output of the program is shown in listing 4.

Listing 4. Output from the function sample
10
Press any key to continue . . .

Our add function is nice, but it could be made more concise with a language feature that might be familiar to some: lambda expressions.

Lambda Expressions

Aphid lambda expressions are special functions that are formed from a single expression. When a lambda expression is invoked, the expression is evaluated and the value is returned.

Since the body of the add function from the previous example consists of a single return statement, it can be refactored into a lambda expression (listing 5).

Listing 5. A C#/Aphid program that defines and invokes an Aphid lambda expression
using Components.Aphid.Interpreter;

namespace LambdaSample
{
    class Program
    {
        static void Main(string[] args)
        {
            var interpreter = new AphidInterpreter();

            interpreter.Interpret(@"
                #'Std';
                add = @(x, y) x + y;
                print(add(3, 7));
            ");
        }
    }
}

The output of the program is shown in listing 6.

Listing 6. Output from the lambda sample
10
Press any key to continue . . .

Higher-order Functions

Aphid functions are values. This makes higher-order functions (i.e. functions that accept and/or return other functions) possible. Listing 7 shows a higher-order Aphid function.

Listing 7. A C#/Aphid program that defines and invokes a higher-order Aphid function
using Components.Aphid.Interpreter;

namespace HigherOrderFunctionSample
{
    class Program
    {
        static void Main(string[] args)
        {
            var interpreter = new AphidInterpreter();

            interpreter.Interpret(@"
                #'Std';
                call = @(func) func();
                foo = @() print('foo() called');                
                call(foo);
            ");
        }
    }
}

The output of the program is shown in listing 8.

Listing 8. Output from the higher-order function sample
foo() called
Press any key to continue . . .

Partial Function Application

Partial function application can be used to apply arguments to a given function, producing a new function that accepts the remaining, unapplied arguments. Partial function application is performed using the function operator (@). Listing 9 shows a function named add that is partially applied with the number literal 10 to produce a new function, add10. Listing 10 shows the output of the sample program.

Listing 9. A C#/Aphid program that demonstrates partial function application
using Components.Aphid.Interpreter;

namespace PartialFunctionApplicationSample
{
    class Program
    {
        static void Main(string[] args)
        {
            var interpreter = new AphidInterpreter();
            
            interpreter.Interpret(@"
                #'Std';
                add = @(x, y) x + y;
                add10 = @add(10);
                print(add10(20));
            ");
        }
    }
}
Listing 10. Output from the partial function application sample
30
Press any key to continue . . .
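
For comparison, the same idea can be expressed on the C# side with delegates. This is a sketch for orientation only: Partial is a hypothetical helper, and binding the first parameter is an assumption about how the function operator orders applied arguments. Since addition is commutative, the result is the same either way.

```csharp
using System;

static class PartialApplicationSketch
{
    // Hypothetical helper: fixes the first argument of a two-argument function.
    public static Func<T2, TResult> Partial<T1, T2, TResult>(
        Func<T1, T2, TResult> func,
        T1 first)
    {
        return second => func(first, second);
    }

    static void Main()
    {
        Func<decimal, decimal, decimal> add = (x, y) => x + y;
        var add10 = Partial(add, 10m);

        Console.WriteLine(add10(20m)); // 30, matching the Aphid sample
    }
}
```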

Pipelining

Pipelining, done with the pipeline operator (|>), offers an alternate syntax for calling functions. Listing 11 shows an example of pipelining, while listing 12 shows the output.

Listing 11. A C#/Aphid program that demonstrates pipelining
using Components.Aphid.Interpreter;

namespace PipeliningSample
{
    class Program
    {
        static void Main(string[] args)
        {
            var interpreter = new AphidInterpreter();
            
            interpreter.Interpret(@"
                #'Std';
                square = @(x) x * x;
                cube = @(x) x * x * x;
                2 |> square |> cube |> print;
            ");
        }
    }
}
Listing 12. Output from the pipelining sample
64
Press any key to continue . . .

Functions that accept multiple parameters can be included in pipelines through the use of partial function application (listing 13).

Listing 13. A C#/Aphid program that demonstrates pipelining and partial function application
using Components.Aphid.Interpreter;

namespace PipeliningSample2
{
    class Program
    {
        static void Main(string[] args)
        {
            var interpreter = new AphidInterpreter();
            
            interpreter.Interpret(@"
                #'Std';
                square = @(x) x * x;
                cube = @(x) x * x * x;
                add = @(x, y) x + y;
                2 |> square |> cube |> @add(4) |> print;
            ");
        }
    }
}
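
C# has no pipeline operator, but a small extension method gives a rough analogue of the pipeline in listing 13. Pipe is a hypothetical helper introduced for illustration, and add4 stands in for the partially applied @add(4). This mirrors the structure of the Aphid code and is not a statement about how Aphid evaluates pipelines.

```csharp
using System;

static class PipelineSketch
{
    // Hypothetical helper: value.Pipe(func) is equivalent to func(value).
    public static TOut Pipe<TIn, TOut>(this TIn value, Func<TIn, TOut> func)
    {
        return func(value);
    }

    static void Main()
    {
        Func<decimal, decimal> square = x => x * x;
        Func<decimal, decimal> cube = x => x * x * x;
        Func<decimal, decimal, decimal> add = (x, y) => x + y;

        // Stand-in for @add(4): add with one argument already supplied.
        Func<decimal, decimal> add4 = y => add(4m, y);

        var result = 2m.Pipe(square).Pipe(cube).Pipe(add4);

        Console.WriteLine(result); // 2 squared is 4, cubed is 64, plus 4 is 68
    }
}
```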

Extension Methods

Extension methods can be used to add methods to the built-in types. This is done with the extend keyword (listing 14). When an extension method is called, the instance the method is called from is passed as the first argument (l in the example below).

Listing 14. A C#/Aphid program that demonstrates extension methods
using Components.Aphid.Interpreter;

namespace ExtensionSample
{
    class Program
    {
        static void Main(string[] args)
        {
            var interpreter = new AphidInterpreter();
            
            interpreter.Interpret(@"
                #'std';

                extend number {
                    square: @(l) l * l
                }

                x = 10;

                print(x.square());
            ");
        }
    }
}

Macros

Aphid macros differ quite a bit from those of languages like C. Unlike the C preprocessor, Aphid's preprocessor uses the same parser as the core language, so macros must abide by the lexical and syntactic conventions of the language. Rather than performing simple string manipulation, Aphid macros operate on the abstract syntax tree (AST). This makes Aphid's macros a bit more heavyweight than C macros, but at the same time it eliminates many of the pitfalls and gotchas associated with them.

Macros can be used at both the expression and statement level. They are expanded recursively, meaning they can be nested.

A C#/Aphid program that demonstrates macros
using Components.Aphid.Interpreter;

namespace MacroSample
{
    class Program
    {
        static void Main(string[] args)
        {
            var interpreter = new AphidInterpreter();

            interpreter.Interpret(@"
                #'std';
                m1 = macro(@{ 'Hello world' });
                m2 = macro(@(msg) { print(msg) });
                
                m3 = macro(@{
                    m2('foobar');
                    m2(m1());
                });
                
                m3();
            ");
        }
    }
}

As the example shows, macros can have parameters and can be used in lieu of almost any expression or statement type. Note that care must be taken when using macros that expand into statements. If an expression is expected and a macro expands into statements, an exception will occur.
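
To illustrate why expanding macros on the syntax tree avoids a classic C preprocessor pitfall, the C# sketch below contrasts text substitution with tree substitution for a hypothetical "square" macro. The Node, Num, and Bin classes are a toy AST invented for this example and are unrelated to Aphid's actual AST types.

```csharp
using System;

abstract class Node
{
    public abstract decimal Eval();
}

class Num : Node
{
    public decimal Value;

    public override decimal Eval() { return Value; }
}

class Bin : Node
{
    public char Op;
    public Node Left, Right;

    public override decimal Eval()
    {
        var left = Left.Eval();
        var right = Right.Eval();

        return Op == '+' ? left + right : left * right;
    }
}

static class MacroSketch
{
    // Text-based expansion, in the style of the C preprocessor.
    public static string SquareText(string arg)
    {
        return arg + " * " + arg;
    }

    // Tree-based expansion: the argument stays a single subtree.
    public static Node SquareTree(Node arg)
    {
        return new Bin { Op = '*', Left = arg, Right = arg };
    }

    static void Main()
    {
        // Precedence leaks in: this text parses as 1 + (2 * 1) + 2.
        Console.WriteLine(SquareText("1 + 2")); // 1 + 2 * 1 + 2

        var onePlusTwo = new Bin
        {
            Op = '+',
            Left = new Num { Value = 1 },
            Right = new Num { Value = 2 }
        };

        // The subtree is multiplied as a unit: (1 + 2) * (1 + 2).
        Console.WriteLine(SquareTree(onePlusTwo).Eval()); // 9
    }
}
```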

Handling Parser Exceptions

The ParserErrorMessage.Create helper method can be used to convert an AphidParserException into a friendly error message. This is especially helpful because syntax errors happen, and it's convenient to know what caused the error and where to begin looking. 

The example below attempts to execute an Aphid script that deliberately contains a syntax error in the argument passed to print. The syntax error is caught using a C# try/catch block, converted to a string using ParserErrorMessage.Create, then written to the console.

A C#/Aphid program that demonstrates friendly parser errors
using Components.Aphid.Interpreter;
using Components.Aphid.Parser;
using System;

namespace ErrorHandlingSample
{
    class Program
    {
        static void Main(string[] args)
        {
            var code = @"
                #'std';
                print('foo'bar');
            ";

            try
            {
                var interpreter = new AphidInterpreter();
                interpreter.Interpret(code);
            }
            catch (AphidParserException e)
            {
                var msg = ParserErrorMessage.Create(code, e);
                Console.WriteLine(msg);
            }
        }
    }
}
Output from the friendly parser error sample
Unexpected identifier bar on line 2

(1)                 #'std';
(2)                 print('foo'bar');
(3)


Press any key to continue . . .

The Aphid scripts shown so far haven't really interacted with their host language. Let's take a look at some of the interoperability features.

.NET Interoperability

Aphid offers two approaches to interoperating with .NET. The explicit approach is described first, and involves working directly with AphidObject along with decorating .NET methods and properties with attributes. Later versions of Aphid added implicit interop, which is described at the end of this section.

Accessing Aphid Variables from .NET

Getting and setting Aphid variables from .NET is done by accessing the CurrentScope property of an AphidInterpreter instance. CurrentScope is nothing more than an AphidObject, which is itself derived from Dictionary<string, AphidObject>. An example of getting an Aphid variable is shown in listing 15, and setting a variable is shown in listing 16.

Listing 15. An interop program that demonstrates getting an Aphid variable with C#
using Components.Aphid.Interpreter;
using System;

namespace VariableGetSample
{
    class Program
    {
        static void Main(string[] args)
        {
            var interpreter = new AphidInterpreter();
            interpreter.Interpret("x = 'foo';");
            Console.WriteLine(interpreter.CurrentScope["x"].Value);
        }
    }
}
Listing 16. An interop program that demonstrates setting an Aphid variable with C#
using Components.Aphid.Interpreter;
using System;

namespace VariableSetSample
{
    class Program
    {
        static void Main(string[] args)
        {
            var interpreter = new AphidInterpreter();
            interpreter.CurrentScope.Add("x", new AphidObject("foo"));

            interpreter.Interpret(@"
                #'Std';
                print(x);
            ");
        }
    }
}
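
Because CurrentScope derives from Dictionary<string, AphidObject>, the usual dictionary semantics presumably apply to it. The sketch below uses a plain Dictionary<string, object> as a stand-in, so it runs without Aphid and makes no claim about members AphidObject may add or hide. It shows two things worth keeping in mind when sharing variables: Add throws if the name is already defined, and TryGetValue avoids an exception when a name is missing.

```csharp
using System;
using System.Collections.Generic;

static class ScopeDictionarySketch
{
    static void Main()
    {
        // Stand-in for interpreter.CurrentScope.
        var scope = new Dictionary<string, object>();
        scope.Add("x", "foo");

        // Reading a name that may not exist.
        object value;
        Console.WriteLine(scope.TryGetValue("y", out value)); // False

        // Add refuses to overwrite an existing name.
        try
        {
            scope.Add("x", "bar");
        }
        catch (ArgumentException)
        {
            Console.WriteLine("x is already defined");
        }

        // The indexer adds or overwrites.
        scope["x"] = "bar";
        Console.WriteLine(scope["x"]); // bar
    }
}
```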

Calling .NET Functions From Aphid scripts

Exposing .NET functions to Aphid is quite simple. First, the .NET function of choice must be decorated with AphidInteropFunctionAttribute (listing 17). The constructor of AphidInteropFunctionAttribute accepts a string that specifies the name of the function as seen by Aphid. This can be a simple identifier (e.g. foo) or a member access expression (e.g. foo.bar.x.y). If it is the latter, Aphid will either construct an object or add members to an existing object as necessary when the function is imported.

Listing 17. An Aphid interop function
using Components.Aphid.Interpreter;

namespace InteropFunctionSample
{
    public static class AphidMath
    {
        [AphidInteropFunction("math.add")]
        public static decimal Add(decimal x, decimal y)
        {
            return x + y;
        }
    }
}

Note that the decorated function is both public and static; this is a requirement for all Aphid interop functions. Now that we've created our interop function, we can proceed to write a script that imports and invokes it (listing 18).

Listing 18. An Aphid program that demonstrates loading a library and invoking an interop function
#'Std';
##'InteropFunctionSample.AphidMath';
print(math.add(3, 7));

The first line, the load script statement, was described in the Hello, world section. The second, however, is slightly different. The load library operator (##) searches Aphid modules (more on this in a bit) for the class specified by the string operand. You may recognize that the operand is the fully qualified name of the container class for the interop function we wrote previously.

So how does the Aphid interpreter know where to find InteropFunctionSample.AphidMath, you ask? Simple: we add the appropriate .NET assembly to the Aphid loader's list of modules. The relevant code is shown in listing 19.

Listing 19. A C#/Aphid program that demonstrates loading a library and invoking an interop function
using Components.Aphid.Interpreter;
using System.Reflection;

namespace InteropFunctionSample
{
    class Program
    {
        static void Main(string[] args)
        {
            var interpreter = new AphidInterpreter();
            interpreter.Loader.LoadModule(Assembly.GetExecutingAssembly());
            
            interpreter.Interpret(@"
                #'Std';
                ##'InteropFunctionSample.AphidMath';
                print(math.add(3, 7));
            ");
        }
    }
}

When run, the application yields the expected output (listing 20).

Listing 20. Output from the interop sample
10
Press any key to continue . . .

Now, let's flip things around.

Calling Aphid Functions From .NET

In some scenarios, you may find yourself needing to invoke Aphid functions from .NET code. This can be achieved by calling the AphidInterpreter.CallFunction instance method, which accepts a function name and the arguments to be passed (listing 21).

Listing 21. A C#/Aphid program that demonstrates calling an Aphid function from .NET
using Components.Aphid.Interpreter;
using System;

namespace CallAphidFunctionSample
{
    class Program
    {
        static void Main(string[] args)
        {
            var interpreter = new AphidInterpreter();
            interpreter.Interpret("add = @(x, y) x + y;");
            var x = interpreter.CallFunction("add", 3, 7).Value;
            Console.WriteLine(x);
        }
    }
}

When run, the Aphid function add is called (listing 22).

Listing 22. Output from the interop sample
10
Press any key to continue . . .

Object Interoperability

Some scenarios may necessitate passing objects back and forth between Aphid and .NET. This can be done by manually creating and manipulating instances of AphidObject, or by using the AphidObject.ConvertTo<T> and AphidObject.ConvertFrom methods (listing 23). Class properties intended to be passed between Aphid and .NET via ConvertTo and ConvertFrom should be decorated with AphidPropertyAttribute.

Listing 23. A C#/Aphid program that demonstrates Aphid object interoperability
using Components.Aphid.Interpreter;
using System;

namespace ObjectSample
{
    public class Point
    {
        [AphidProperty("x")]
        public int X { get; set; }

        [AphidProperty("y")]
        public int Y { get; set; }

        public override string ToString()
        {
            return string.Format("{0}, {1}", X, Y);
        }
    }

    public class Widget
    {
        [AphidProperty("name")]
        public string Name { get; set; }

        [AphidProperty("location")]
        public Point Location { get; set; }

        public override string ToString()
        {
            return string.Format("{0} ({1})", Name, Location);
        }
    }

    class Program
    {
        static void Main(string[] args)
        {
            var interpreter = new AphidInterpreter();
            
            interpreter.Interpret(@"
                #'Std';
                
                ret {
                    name: 'My Widget',
                    location: { x: 10, y: 20 }
                };
            ");

            var widget = interpreter.GetReturnValue().ConvertTo<Widget>();
            Console.WriteLine(widget);
            widget.Location.X = 40;
            var aphidWidget = AphidObject.ConvertFrom(widget);
            interpreter.CurrentScope.Add("w", aphidWidget);
            interpreter.Interpret(@"printf('New X value: {0}', w.location.x);");
        }
    }
}

Listing 24. Output from the object interoperability sample
My Widget (10, 20)
New X value: 40
Press any key to continue . . .
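
To give a feel for how attribute-driven conversion of this kind can work, here is a minimal sketch using plain reflection. It is not Aphid's implementation: MapNameAttribute, Mapper, and FromDictionary are hypothetical names invented for this illustration, and a Dictionary<string, object> stands in for AphidObject. Nested objects, such as Widget.Location in listing 23, are deliberately left out.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

namespace AttributeMappingSketch
{
    // Hypothetical stand-in for AphidPropertyAttribute.
    [AttributeUsage(AttributeTargets.Property)]
    public class MapNameAttribute : Attribute
    {
        public string Name { get; private set; }

        public MapNameAttribute(string name)
        {
            Name = name;
        }
    }

    public class Point
    {
        [MapName("x")]
        public int X { get; set; }

        [MapName("y")]
        public int Y { get; set; }
    }

    public static class Mapper
    {
        // Copies dictionary entries onto the properties whose attribute name matches the key.
        public static T FromDictionary<T>(IDictionary<string, object> source)
            where T : new()
        {
            var target = new T();

            foreach (var prop in typeof(T).GetProperties())
            {
                var attr = prop.GetCustomAttribute<MapNameAttribute>();
                object value;

                // Properties without the attribute, or without a matching key, are skipped.
                if (attr != null && source.TryGetValue(attr.Name, out value))
                {
                    prop.SetValue(target, Convert.ChangeType(value, prop.PropertyType));
                }
            }

            return target;
        }
    }

    class Program
    {
        static void Main(string[] args)
        {
            var data = new Dictionary<string, object> { { "x", 10 }, { "y", 20 } };
            var point = Mapper.FromDictionary<Point>(data);
            Console.WriteLine("{0}, {1}", point.X, point.Y);
        }
    }
}
```

The attribute decouples the script-side member name (x) from the .NET property name (X), which is the same role AphidPropertyAttribute plays in listing 23.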

Seamless .NET Interoperability

More recent versions of Aphid can interoperate with .NET without any wire-up or glue code. While these capabilities are not yet complete, it is already possible to instantiate and manipulate classes, use some generic types, and perform many other operations.

A using keyword reminiscent of C#'s has been added. It can be used to import the specified namespace into scope. Static methods of .NET classes can be used as first-class citizens in Aphid code. They can be invoked, passed as values, partially applied, etc.

A C#/Aphid program that demonstrates Aphid's seamless .NET static method interop capabilities
using Components.Aphid.Interpreter;

namespace SeamlessInteropSample
{
    class Program
    {
        static void Main(string[] args)
        {
            var interpreter = new AphidInterpreter();
            
            interpreter.Interpret(@"
                using System;
                Console.WriteLine('Hello world');
                print = Console.WriteLine;
                print('{0}', 'foo');
                printBar = @Console.WriteLine('{0}bar');
                printBar('foo');
            ");
        }
    }
}

As the sample demonstrates, the Aphid runtime resolves overloads even when partial application is used.
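
For readers more familiar with C#, the script above corresponds roughly to the following delegates. This is only a hand-written analogy, not a description of how Aphid works internally: the Functional.Partial helper is a hypothetical name introduced here, and in C# the overload is chosen at compile time by the delegate type, whereas the Aphid runtime picks it dynamically.

```csharp
using System;

namespace PartialApplicationSketch
{
    public static class Functional
    {
        // Fixes the first argument of a two-argument action, yielding a one-argument action.
        public static Action<T2> Partial<T1, T2>(Action<T1, T2> action, T1 first)
        {
            return second => action(first, second);
        }
    }

    class Program
    {
        static void Main(string[] args)
        {
            // The target delegate type selects the Console.WriteLine(string, object) overload.
            Action<string, object> print = Console.WriteLine;
            print("{0}", "foo");

            // Hand-written counterpart of: printBar = @Console.WriteLine('{0}bar');
            Action<object> printBar = Functional.Partial(print, "{0}bar");
            printBar("foo");
        }
    }
}
```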

Instantiation of .NET classes can be performed using the unary operator new. Overloaded constructors are supported.

A C#/Aphid program that demonstrates Aphid's seamless .NET class interop capabilities
using Components.Aphid.Interpreter;

namespace SeamlessInteropSample2
{
    class Program
    {
        static void Main(string[] args)
        {
            var interpreter = new AphidInterpreter();

            interpreter.Interpret(@"
                using System;
                using System.Text;
                sb = new StringBuilder('Hello');
                sb.Append(' world');
                
                Console.WriteLine(
                    'Length={0}, Capacity={1}', 
                    sb.Length,
                    sb.Capacity);

                sb |> Console.WriteLine;
            ");
        }
    }
}

.NET instances are first-class citizens in Aphid, and their members are accessed using the standard syntax.

The load operator has been added to allow Aphid scripts to load .NET assemblies at runtime. Assemblies are specified by partial name, which is generally the filename without the .dll extension. In the following example, System.Web.Extensions.dll is loaded using the load operator.

A C#/Aphid program that demonstrates Aphid's seamless .NET assembly interop capabilities
using Components.Aphid.Interpreter;

namespace SeamlessInteropSample3
{
    class Program
    {
        static void Main(string[] args)
        {
            var interpreter = new AphidInterpreter();

            interpreter.Interpret(@"
                load System.Web.Extensions;
                using System;
                using System.Web.Script.Serialization;
                serializer = new JavaScriptSerializer();
                
                obj = 
                    '{ ""x"":52, ""y"":30 }' 
                    |> serializer.DeserializeObject;
                
                Console.WriteLine(
                    'x={0}, y={1}',
                    obj.get_Item('x'),
                    obj.get_Item('y'));
            ");
        }
    }
}
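
The get_Item calls in the script are the accessor method that sits behind the C# indexer. For a JSON object, DeserializeObject typically returns a Dictionary<string, object>, so each lookup is equivalent to obj["x"] in C#. The sketch below shows that relationship using an ordinary dictionary rather than the serializer. IndexerDemo and GetViaAccessor are hypothetical names used only for this illustration.

```csharp
using System;
using System.Collections.Generic;

namespace IndexerSketch
{
    public static class IndexerDemo
    {
        // Looks up a key by invoking the get_Item accessor through reflection.
        public static object GetViaAccessor(Dictionary<string, object> obj, string key)
        {
            var getItem = obj.GetType().GetMethod("get_Item");
            return getItem.Invoke(obj, new object[] { key });
        }
    }

    class Program
    {
        static void Main(string[] args)
        {
            var obj = new Dictionary<string, object> { { "x", 52 }, { "y", 30 } };

            // obj["x"] uses indexer syntax; the second lookup calls the same accessor by name.
            Console.WriteLine(
                "x={0}, y={1}",
                obj["x"],
                IndexerDemo.GetViaAccessor(obj, "y"));
        }
    }
}
```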

Internals

Aphid's language implementation is divided into three layers: lexical analysis, parsing, and interpretation. Each layer represents a stage in the execution of an Aphid program. The first stage, lexical analysis, reads the script file and tokenizes it. The parser then takes the sequence of tokens and uses it to construct an abstract syntax tree (AST). Finally, the interpreter walks the AST, maintaining state and executing each node along the way.
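
The three stages can be illustrated with a toy language that supports only integer addition and multiplication. The sketch below is not Aphid code and shares nothing with its implementation; every type and method name in it (Pipeline, Tokenize, Parse, Evaluate, and the node classes) is made up for this example.

```csharp
using System;
using System.Collections.Generic;

namespace PipelineSketch
{
    // AST node types produced by the parser.
    public abstract class Node { }
    public class NumberNode : Node { public int Value; }
    public class BinaryNode : Node { public char Op; public Node Left, Right; }

    public static class Pipeline
    {
        static bool IsDigit(char c) { return c >= '0' && c <= '9'; }

        // Stage 1: lexical analysis turns text into number and operator tokens.
        public static List<string> Tokenize(string text)
        {
            var tokens = new List<string>();
            var i = 0;

            while (i < text.Length)
            {
                if (char.IsWhiteSpace(text[i]))
                {
                    i++; // Whitespace is dropped rather than emitted as a token.
                }
                else if (IsDigit(text[i]))
                {
                    var start = i;
                    while (i < text.Length && IsDigit(text[i])) i++;
                    tokens.Add(text.Substring(start, i - start));
                }
                else if (text[i] == '+' || text[i] == '*')
                {
                    tokens.Add(text[i++].ToString());
                }
                else
                {
                    throw new FormatException("Unexpected character: " + text[i]);
                }
            }

            return tokens;
        }

        // Stage 2: recursive descent parsing builds the AST; '*' binds tighter than '+'.
        public static Node Parse(List<string> tokens)
        {
            var pos = 0;
            var root = ParseSum(tokens, ref pos);
            if (pos != tokens.Count) throw new FormatException("Unexpected token: " + tokens[pos]);
            return root;
        }

        static Node ParseSum(List<string> tokens, ref int pos)
        {
            var left = ParseProduct(tokens, ref pos);

            while (pos < tokens.Count && tokens[pos] == "+")
            {
                pos++;
                left = new BinaryNode { Op = '+', Left = left, Right = ParseProduct(tokens, ref pos) };
            }

            return left;
        }

        static Node ParseProduct(List<string> tokens, ref int pos)
        {
            var left = ParseNumber(tokens, ref pos);

            while (pos < tokens.Count && tokens[pos] == "*")
            {
                pos++;
                left = new BinaryNode { Op = '*', Left = left, Right = ParseNumber(tokens, ref pos) };
            }

            return left;
        }

        static Node ParseNumber(List<string> tokens, ref int pos)
        {
            if (pos >= tokens.Count) throw new FormatException("Unexpected end of input");
            return new NumberNode { Value = int.Parse(tokens[pos++]) };
        }

        // Stage 3: interpretation walks the AST and computes a value for each node.
        public static int Evaluate(Node node)
        {
            var number = node as NumberNode;
            if (number != null) return number.Value;

            var binary = (BinaryNode)node;
            var left = Evaluate(binary.Left);
            var right = Evaluate(binary.Right);
            return binary.Op == '+' ? left + right : left * right;
        }
    }

    class Program
    {
        static void Main(string[] args)
        {
            var tokens = Pipeline.Tokenize("2 + 3 * 4");
            Console.WriteLine(Pipeline.Evaluate(Pipeline.Parse(tokens)));
        }
    }
}
```

Keeping the stages separate means each one consumes only the output of the previous one, which is what allows Aphid's lexer and parser to be generated independently of the interpreter.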

Mantispid: A Powerful Lexer And Parser Generator

Aphid is built using a custom language implementation toolchain. Both the lexical analyzer and the recursive descent parser are generated using a tool named Mantispid, which is itself an alternate frontend for Aphid. Mantispid accepts a script as input, and outputs a lexical analyzer and parser along with supporting classes in a single C# file.

To use Mantispid, build it from source using Visual Studio. The project can be found in the \Mantispid folder. Once it's built, run it from the command line without any arguments to display its usage.

C:\source\Aphid\Mantispid\bin\Debug>Mantispid.exe
mantispid [Parser Script] [Output File]

C:\source\Aphid\Mantispid\bin\Debug>

With Mantispid built, Aphid's lexer and parser can be regenerated from the input file located at \Components.Aphid\Aphid.alx, and output to \Components.Aphid\Parser\AphidParser.g.cs.

C:\source\Aphid\Components.Aphid>..\Mantispid\bin\Debug\Mantispid.exe Aphid.alx Parser\AphidParser.g.cs
Parsing input file
Generating parser
Parser written to 'Parser\AphidParser.g.cs'

C:\source\Aphid\Components.Aphid>

The generated C# file is quite large, but moderately human-readable after formatting it using Visual Studio (Ctrl+K, Ctrl+D). Fortunately, there's no need to work directly with it; apart from a handful of partial classes, changes to the lexer and parser are made exclusively through the Mantispid input files.

Lexical Analysis

The lexical analyzer can be found in \Components.Aphid\Aphid.Lexer.alx. An abridged version is below.

Lexer({
    init: @() {
        #'Aphid.Lexer.Tmpl';
        #'Aphid.Lexer.Code';
    },
    name: "Components.Aphid.Lexer.Aphid",
    modes: [ 
        {
            mode: "Aphid",
            tokens: [
                { regex: '%>', type: 'GatorCloseOperator', newMode: "Text" },

                { regex: "#", type: "LoadScriptOperator" },
                { regex: "##", type: "LoadLibraryOperator" },

                { regex: ",", type: "Comma" },
                { regex: ":", type: "ColonOperator" },
                { regex: "@", type: "functionOperator" },
                { regex: "@>", type: "CompositionOperator" },
                
                /* ... */

                { regex: ";", type: "EndOfStatement" },

                { regex: "\\r|\\n|\\t|\\v|\\s", type: "WhiteSpace" },
                { code: idCode },
                { regex: "0", code: getNumber(
                    'NextChar();\r\nstate = 1;', 'return AphidTokenType.Number;') },
                { regex: "0x", code: zeroXCode },
                { regex: "0b", code: zeroBCode },
                { code: getNumber(
                    'state = 0;', 
                    'if (state == 1 || state == 3 || state == 5) { return AphidTokenType.Number; }') },
                getString('"'),
                getString("'"),
                { regex: "//", code: singleLineCommentCode },
                { regex: "/\\*", code: commentCode }
            ],
            keywords: [
                "true",
                "false",
                "null",

                /* ... */

                "try",
                "catch",
                "finally",
            ],
            keywordDefault: getKeywordHelper('Identifier'),
            keywordTail: getKeywordHelper('{Keyword}')            
        },
        {
            mode: "Text",
            tokens: [
                { regex: '<%', type: 'GatorOpenOperator', newMode: "Aphid" },
                { regex: '<%=', type: 'GatorEmitOperator', newMode: "Aphid" },
                { regex: '<', code: textCode },
                { code: textCode },
            ]
        },
    ],
    ignore: [ 
        "WhiteSpace",
        "Comment" 
    ]
});

Apart from some templatized helper code, Aphid's lexical analyzer is described in its entirety in Aphid.Lexer.alx as an object expression passed to the special Lexer function. Each property of the object expression has domain-specific semantics that direct Mantispid in constructing the lexer.

The first property of the object, init, is a function that is called when lexer generation begins. In this case, it is used to load templatized code in the form of external code files.

The second property, name, is a string used to construct the class name of the lexer by suffixing the value with "Lexer".

The third property, modes, is a list that contains the bulk of the lexer declaration. Each element of the list is an object that represents a mode for the lexical analyzer. Modes have a name that is specified by the mode property, and a list named tokens. The elements of tokens are objects that use various properties to declare different tokens for the current lexer mode.

Token objects can have the following properties: regex, type, code, and newMode. The regex property specifies a limited regular expression used to match a token. Note that only a small subset of regular expressions is currently supported. The type property specifies the type of token being matched. The code property can be used in place of type to specify a block of code that is executed when regex is matched, or when no tokens are matched if regex is not defined. The newMode property is used to specify a mode to transition to when a token is matched.
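
The effect of modes and the newMode property can be pictured with a small hand-written tokenizer that switches between a text mode and a code mode at the <% and %> delimiters, much as the Text and Aphid modes do above. This is a simplified sketch rather than the code Mantispid generates. ModeLexer and the "Code" mode name are assumptions made for the example, and each mode here produces a single chunk instead of individual tokens.

```csharp
using System;
using System.Collections.Generic;

namespace LexerModeSketch
{
    public static class ModeLexer
    {
        // Splits a template into (mode, text) chunks, switching modes at <% and %>.
        public static List<KeyValuePair<string, string>> Tokenize(string input)
        {
            var tokens = new List<KeyValuePair<string, string>>();
            var mode = "Text";
            var pos = 0;

            while (pos < input.Length)
            {
                // The delimiter that ends the current mode.
                var delimiter = mode == "Text" ? "<%" : "%>";
                var end = input.IndexOf(delimiter, pos, StringComparison.Ordinal);
                var stop = end < 0 ? input.Length : end;

                if (stop > pos)
                {
                    tokens.Add(new KeyValuePair<string, string>(
                        mode,
                        input.Substring(pos, stop - pos)));
                }

                if (end < 0)
                {
                    break; // No further delimiter: the rest of the input was one chunk.
                }

                // Similar in spirit to newMode: matching the delimiter changes the active mode.
                pos = end + delimiter.Length;
                mode = mode == "Text" ? "Code" : "Text";
            }

            return tokens;
        }
    }

    class Program
    {
        static void Main(string[] args)
        {
            foreach (var token in ModeLexer.Tokenize("Hello <% print(1); %> world"))
            {
                Console.WriteLine("{0}: [{1}]", token.Key, token.Value);
            }
        }
    }
}
```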

Following the tokens property is keywords, which is a string list that defines the keywords of the Aphid language.

Next is the keywordDefault property, which specifies code to be run when a keyword match fails. This is needed because a stream of characters may initially appear to be a keyword, only to fail partway through the match. In many languages, the characters read up to that point may still form a valid identifier, and scanning must continue so that all matchable characters are consumed, as per the maximal munch principle. An example of the first case is the identifier cat, which partially matches the Aphid keyword catch. An example that demonstrates the need to continue scanning is the identifier catastrophe, which also partially matches catch. The keywordDefault property provides a solution to this common problem.

After keywordDefault comes the keywordTail property, which applies the maximal munch principle in a similar way to successful keyword matches. In many languages, when a keyword is matched, scanning must continue to ensure the token is not merely an identifier that starts with the matched keyword. An example of this would be the identifier catchable, which starts with the keyword catch. The keywordTail property is intended to handle this case with a templatized block of code. To reference the matched keyword in the template, the {Keyword} token is used.
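
The four cases discussed above (cat, catch, catastrophe, and catchable) can be checked with a short sketch. Note that it uses a simpler technique than the generated lexer: it first consumes the longest run of identifier characters and then looks the word up in a keyword set, whereas Mantispid matches keywords character by character and falls back through keywordDefault and keywordTail. WordScanner and Classify are hypothetical names, and the keyword list is limited to the three keywords visible at the end of the abridged listing.

```csharp
using System;
using System.Collections.Generic;

namespace MaximalMunchSketch
{
    public static class WordScanner
    {
        static readonly HashSet<string> Keywords =
            new HashSet<string> { "try", "catch", "finally" };

        // Reads the longest run of identifier characters at 'start', then classifies it.
        public static string Classify(string input, int start)
        {
            var end = start;

            // Maximal munch: keep consuming for as long as the character can be part of the word.
            while (end < input.Length &&
                (char.IsLetterOrDigit(input[end]) || input[end] == '_'))
            {
                end++;
            }

            var word = input.Substring(start, end - start);

            // Only an exact match of the whole word counts as a keyword.
            var type = Keywords.Contains(word) ? word + "Keyword" : "Identifier";

            return type + ": " + word;
        }
    }

    class Program
    {
        static void Main(string[] args)
        {
            foreach (var word in new[] { "cat", "catch", "catastrophe", "catchable" })
            {
                Console.WriteLine(WordScanner.Classify(word, 0));
            }
        }
    }
}
```

Only catch is reported as a keyword; the other three words come back as identifiers, which is the behavior keywordDefault and keywordTail exist to guarantee.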

Finally, the ignore property is a string list of token types that should be dropped automatically during tokenization.

Mantispid's lexical analysis generation is complicated, but the result is a high-performance lexer with a clean object model. Tokenizing strings with the generated lexer requires very little code. The program below instantiates AphidLexer, uses it to tokenize a simple program, then dumps the tokens to the console.

using Components.Aphid.Lexer;
using System;

namespace LexerSample
{
    class Program
    {
        static void Main(string[] args)
        {
            var lexer = new AphidLexer(@"
                using System;
                Console.WriteLine('Hello world');
            ");

            foreach (var t in lexer.GetTokens())
            {
                Console.WriteLine(t);
            }
        }
    }
}

The output looks like this:

[18] usingKeyword: using
[24] Identifier: System
[30] EndOfStatement: ;
[49] Identifier: Console
[56] MemberOperator: .
[57] Identifier: WriteLine
[66] LeftParenthesis: (
[67] String: 'Hello world'
[80] RightParenthesis: )
[81] EndOfStatement: ;
Press any key to continue . . .

Resources

History

  • 4/22/2018 - Numerous fixes and improvements
    • Improved support for sharing resources across thread boundaries.
    • Direct type references are now supported, so typeof(StringBuilder) can instead be written as StringBuilder. This feature was added while retaining support for both System.Type instance members and type declared static members. For example, File.GetMethods() works as expected, as does File.ReadAllText().
    • Added support for explicit generics, so classes like List<T> and Dictionary<TKey, TValue> can be instantiated without using reflection.
    • Extensive type conversion updates, including the ability to automatically pass Aphid functions as .NET delegates without any explicit type information.
    • Rewrote type system, adding powerful type inference with support for generics.
    • Major updates to REPL, including syntax highlighting and extensive autocomplete.
    • Overhauled extension support with static extension methods, extension properties, extension constructors, and dynamic extension members.
    • Added shell support, with command-line style syntax, PowerShell Cmdlet interop, and remoting.
    • C#-style using statements are now supported for handling IDisposable instances.
    • Added support for hundreds of new custom operators.
    • Support for block-level performance profiling.
    • Optional strict mode that requires variables be explicitly declared using the var attribute.
    • Replaced Common Compiler Infrastructure-based ILWeave with Medusa-based view-model compiler for far more concise view-model definitions.
    • Added remoting, with ability to serialize abstract syntax trees and lexical state to send over the wire.
    • Added tools to build executables without Visual Studio or other external tools.
    • Added Medusa, a powerful new metaprogramming/white-box language system.
    • Improved runtime error checking.
    • Several fixes and updates to serialization.
    • Improved array interop support.
    • Added debugging and syntax highlighting extension for Visual Studio Code.
    • Several updates to Visual Studio Plugin, including support for Visual Studio 2013 to 2017.
    • Numerous fixes to lexical scope support.
    • Major performance improvements due to optimizations that include rewritten hot paths and type memoization.
  • 4/16/2016 - Mostly article updates along with a few bug fixes.
    • Added table of contents.
    • Added list documentation.
    • Added parser error documentation.
    • Type system and type documentation updated.
    • Syntax documentation updated.
    • Started internals documentation.
    • Fixed Aphid document support.
    • Reordered sections.
  • 4/11/2016 - Far too many changes to list, but some of the major updates are below.
    • Syntax documentation updated.
    • Macro documentation added.
    • Seamless .NET interop documentation added.
    • Added macro support.
    • Added partial operator application.
    • Added seamless .NET interop.
    • Added Visual Studio plugin.
    • Added implicit pipeline syntax.
    • Added compiler frontends that output Python, PHP, and Verilog.
    • Added parser generator frontend, used it to bootstrap Aphid parser.
    • Added stack trace support.
    • Added Aphid document support.
    • Added binary number support.
    • Improved parser error messages.
    • Added unit tests.
    • Numerous bug fixes.
  • 11/28/2013 - Several updates and fixes to code.
    • Added switch support.
    • Added range operator.
    • Added conditional operator.
    • Added query operators.
    • Added prefix/postfix increment/decrement support.
    • Added num function.
    • Added env.processes function.
    • Added UDP library.
    • Added ILWeave tool.
    • Added WPF AphidRepl Control.
    • Several updates to REPL.
    • Replaced exists operator with defined keyword.
    • Fixed number literal tokenization issue.
    • Fixed AphidObject.ConvertFrom number conversion bug.
    • Fixed loader issues.
    • Fixed string.substring extension method.
  • 11/07/2013 - Updated article to cover control structures, partial function application, pipelining, extension methods, and object interoperability.
  • 11/06/2013 - Added try/catch/finally support, added while loop support, added * assignment (+=, -=, etc) support, fixed serialization, added interop functions, fixed negative number literal support.
  • 11/05/2013 - Added examples for basic types, added download link.
  • 10/16/2013 - First version of this article.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.