Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles / Languages / Java

Java Parameterised Expression Evaluator

4.93/5 (10 votes)
20 Mar 2013CPOL4 min read 29.3K   615  
A paramaterized expression evaluator in Java.

Introduction

An expression evaluator is a process that takes a human readable expression, in this case a string, and evaluates the result. In this case, the expression evaluator can take in a set of parameters and use them as additional input.

Background

I was using various ad-hoc techniques and free libraries for this; but none actually did what I wanted. My two requirements were that it should parse it once and be able to evaluate multiple times based on a set of parameters. I know, I'll build my own!

My first step was to look at how an expression would be evaluated. Ignoring the syntax and how I could tokenize the words, I looked instead at the end product. The design I came up with was that each action is represented by a separate class and a hierarchy would be built where one simple expression can call its children until they all evaluate.

With the output structure in place it became a simple exercise of going from the input to the output. A hierarchy was built, based on order of precedence of how each different type of expression is evaluated and the conversion from string to expressions was placed in a factory.

The class structure for building an expression is:

Class structure for lxExpression

The parameters to execute an expression are passed in using a generic DataSet.

Class Structure

The Expression Parser

The hardest part, for me, in writing this was in working out how to build the components from the input string. The execution was easy, it was building the expression in the first place that proved difficult. Eventually after much deliberation, I found a syntax that I like.

EXPRESSION    ::= ID "=" ELEMENT
                | LOGICAL_OR
LOGICAL_OR    ::= LOGICAL_AND "||" LOGICAL_OR
                | LOGICAL_AND
LOGICAL_AND   ::= COMPARE "&&" LOGICAL_AND
                | COMPARE
COMPARE       ::= SUM "<" SUM
                | SUM "<=" SUM
                | SUM "==" SUM
                | SUM "!=" SUM
                | SUM ">=" SUM
                | SUM ">" SUM
                | SUM
SUM           ::= TERM "+" SUM
                | TERM "-" SUM
                | TERM
TERM          ::= FACTOR "*" TERM
                | FACTOR "/" TERM
                | FACTOR
FACTOR        ::= PRIMARY "^" FACTOR
                | PRIMARY "%" FACTOR
                | PRIMARY
PRIMARY       ::= "-" ELEMENT
                | "!" ELEMENT
                | ELEMENT
ELEMENT       ::= ID
                | CONSTANT
                | "(" EXPRESSION ")"

It is only the last item ELEMENT that needs to be explained. It is either another expression in brackets, or an CONSTANT or ID. Constants are numbers and quoted strings, leaving IDs as unquoted strings. These are the parameters that will be taken from the input when evaluating the expression

With this structure, I was able to construct the class ExpressionParser which builds the expressions from the string. This is done in two stages, first the expression is tokenized, to find the words and what type they each are. Then the expression objects are built in exactly the structure as shown above.

As an example, here is the method that parses the top level expression:
Java
/**
 * Parse the text for an expression:
 * <pre>
 * EXPRESSION ::= ID "=" ELEMENT
 *              | LOGICAL_OR
 * </pre>
 *
 * @param start the word at which to start parsing.
 *
 * @return an {@see Expression} and the index of the last word.
 */
private ReturnExpression expression(int start)
        throws ExpressionException {
    if (this.getType(start) == WordType.LITERAL &&
            this.getType(start + 1) == WordType.ASSIGN) {
        ReturnExpression ex = element(start+2);
        return new ReturnExpression(ex.end, new Assign(this.getWord(start),ex.expression));
    }
    return logicalOr(start);
}

None of the corresponding methods are any more complex. Separating evaluation from parsing the tokenized expression created a much simpler process and also allows the expression to be repeatedly evaluated without needing to parse it once more; thus giving the second requirement.

Expressions

The expression classes again went through severe metamorphosis from when I started. Each action is represented by it's own class and there is no state except for the structure of the expressions themselves. This results in a lot of small and simple classes; I like simple.

At the base is the class Expression, this is abstract and contains one simple method signature. This is the method that is called to evaluate any expression, again it has been made simple and it works. In implementing the class, an overriding methodology was that if the expression does not make sense, it should return null rather than throw an exception. Exceptions are thrown, but kept to a minimum to reduce pressure on up-stream usage.

There are two more classes of interest that are used in evaluating combined expressions. Firstly MultipleExpression extends Expression and provides support for processing the results of more than one expression and bringing them together. This in turn is extended by Numeric to provide the basis for handling numeric expressions.

Using the code

The premise of this is that an expression is parsed and then can be evaluated one or more times. Along with the lxExpression library, I have included a test application that just runs expressions through the evaluator. The tester shows how to parse and evaluate expressions.

This tool uses a file representing a DataSet that is read and then evaluated. The format of the file is simply a list of tests to run and then a block for each test:

test - <list of names>
[<name> {
  expression - <expression>
  data {
    <requred data>
  }
}
...]

The code that actually runs a test is as simple as everything that has passed before. Build it, parse it and evaluate it.

Java
/**
 * Perform a test on the expression evaluator.
 * <pre<
 * expression - <expression>
 * data {
 *   <requred data>
 * }
 * </pre<
 * @param test a dataset containing a test.
 */
private static void test(DataSet test) {
    String expression = test.getString("expression");
    DataSet data = test.getDataSet("data");
    System.out.println("Expression: " + expression);
    System.out.println("Data: " + data);
    try {
        ExpressionParser ep = new ExpressionParser();
        ep.setExpression(expression);
        ep.parse();
        DataSet clone = new DataSet(data);
        Expression ex = ep.getExpression();
        System.out.println("Ex: " + ex);
        Object res = ex.evaluate(clone);
        System.out.println("Result: " + res);
        System.out.println("Data: " + clone);
    } catch (ExpressionException ex) {
        ex.printStackTrace();
    }
    System.out.println("---");
    System.out.println("");
}

There's not a great deal more to say about this. I now have this library to use in place of a lot of other tokenizing and evaluation code.

Points of Interest

There are a few things I have, half intentionally, left out for now. There are needed and I will have to get around to them soon; but as it stands I can carry on without them.

  • Support an if/then/else construct.
  • There is no support for functions. I want some basic math - such as sin, cos, tan - as well as string manipulation and casting. Eventually this should include user (or at least application) defined functions.
  • Dates do not work properly and I just can't get around to testing them; use them at your own risk.

History

  • 2013-03: Initial submission.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)