Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles
(untagged)

C# Expression Parser using RPN

0.00/5 (No votes)
17 Jan 2004 1  
Design & implementation of an Expression Parser using RPN in C#.

Introduction

The aim of this article is to demonstrate an interface-based programming paradigm in the design and implementation of a simple Expression Parser & Evaluator using RPN in C#. The intended audience skill set would be Beginner to Intermediate.

Background

For introduction about Expressions & Reverse Polish Notation, refer to this article - Expression Evaluator: using RPN by lallous, in the C++/MFC section.

Design Description

(All code referred here is inside RPNParser.cs file.)

Interfaces

Every expression is made of operands and operators. These are reflected in the interfaces - IOperand and IOperator.

  • IArithmeticOperations defines methods for binary arithmetic operators (+, -, *, \, %).
  • IComparisonOperations defines methods for all comparison operators (>, >=, <, <=, ==, !=).
  • ILogicalOperations defines methods for logical operators (&& , ||).

The idea behind these interfaces is as follows -

Each Operand supports a set of Operations and the Operators evaluate a pair of Operands using the Operations supported by the two Operands.

E.g. A Long Operand supports Arithmetic and Comparison Operations, whereas a String might support only Comparison Operations. Boolean Operands support only the Logical Operations.

An Operator like + needs only to know that it has to call the Plus method on an IOperand implementing the IArithmeticOperations interface. This keeps the algorithm of expression parsing & evaluation (in this case the RPN sequencing) separated from the data type that does the actual evaluations and hence makes it very extensible to newer data types.

Operands

Operand class is an abstract base class implementing the IOperand interface and additionally providing data storage.

LongOperand is an Operand class specific to Long & Int (Int64 and Int32) types of operands and implements the IArithmeticOperations, IComparisonOperations interfaces.

BoolOperand is an Operand class specific to bool type of operands and implements only the ILogicalOperations interface.

OperandHelper is a static class providing object factory functionality to create appropriate Operand objects based on the Type passed to it (OperandHelper.CreateOperand).

For new types of Operands like Decimal/String or user-defined types, e.g. a Matrix, this factory method would have to be extended.

Operators

IOperator defines one method called Eval and is used to evaluate the LHS and RHS of the expression.

Operator is an abstract base class implementing this interface and provides data storage for the operator.

ArithmeticOperator, ComparisonOperator, LogicalOperator are operator classes to support IArithmeticOperations, IComparisonOperations, ILogicalOperations respectively.

OperatorHelper is a helper class providing object factory functionality for creation of appropriate Operator objects (OperatorHelper.CreateOperator). This class provides services to determine operator-precedence. This is based on the relative indexes of the operators defined in the m_AllOps variable. It also provides Regular Expressions used during parsing of the input expression string. E.g. The regex [=<>!]{1,2} is used to parse for Comparison operators.

Expression Parsing

The Tokenizer class provides functionality of parsing the input expression string using regular expressions.

The TokenEnumerator class supports the IEnumerator interface and is used to loop through the various tokens in the input expression string. The tokens in the expression are returned to the caller as Token objects.

The given expression can be parsed as either Arithmetic or Logical or Comparison Expression Types. This is controlled by the enums ExpressionType::ET_ARITHMETIC, ExpressionType::ET_COMPARISON and ExpressionType::ET_LOGICAL. A combination of these enum types can also be given.

E.g. To parse the expression as all of these, pass ExpressionType.ET_ARITHMETIC| ExpressionType.ET_COMPARISON|ExpressionType.ET_LOGICAL to the Tokenizer c'tor.

The generation of the RPN sequence is in the RPNParser::GetPostFixNotation method and it uses the algorithm as described in the above mentioned C++ article. The return value is an ArrayList consisting of Operands and Operators in the RPN sequence.

This method takes an argument bFormula that determines if the expression being parsed is a formula (in the form of x+y*z/t) or has direct values (3+6*7/2). If the expression is not a formula, then the values are extracted by this method and stored in the Token object for later evaluation. If the expression is a formula, then the values for the corresponding variables need to be passed separately as a hashtable for evaluation.

Expression Evaluation

RPNParser::EvaluateRPN is the method performing actual evaluation of the RPN generated by the RPNParser::GetPostFixNotation method and is again based on the algorithm described in the above mentioned C++ article. This method takes as input the ArrayList with the RPN sequence and if the expression is a formula, a hashtable with values for the variables in the expression.

Usage

A typical usage of the parser would be as follows -

RPNParser parser = new RPNParser();

string szExpression = @"((2+3)*2<10 || 1!=1) && 2*2==4";

parser.EvaluateExpression( szExpression, 
    Type.GetType("System.Int64" ), false, null );

The attached code contains a small form-based application to test the logic.

It uses the RPN_Parser::Convert2String() method to get a string representation of the RPN sequence.

Extensibility

The Parser can be easily extended to support new Operand types and Operators by subclassing the Operand and Operator classes. Since the Parsing and Evaluation logic works totally on interfaces, typically nothing would have to be changed there. Of course, the factory classes need to be extended to support the new types and if adding new operators, the regular expressions used & their precedence information present inside the Operatorhelper class would have to be modified accordingly.

Hope someone out there finds this useful.

Current Shortcomings

  1. Unary operators (+,-,!) not handled.
  2. Does not yet support expressions having mixture of variables and values (e.g. x+y/3*z%2 ).
  3. Bitwise & and | not handled.
  4. Does not handle a combo expression with multiple braces as just one of the ExpressionTypes?

    // E.g. ((2+3)*2<10 || 1!=1) && 2*2==4 as just a logical expression cannot be done.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.

A list of licenses authors might use can be found here