Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles
(untagged)

A Simple, Infix to Reverse Polish Notation Transformer, Written in C#

0.00/5 (No votes)
10 Mar 2009 1  
Transforms a mathematical expression from Infix notation to Reverse Polish notation

Introduction

The class presented in this article can be used to transform a mathematical expression from infix notation to Reverse Polish notation. By default, there is no validation of the expression, but some simple validation may be turned on through an Options parameter.

See Shunting yard algorithm for more information.

Background

I needed this for the ParseInfix method for my Rational type.

The class exposes only one public method: the InfixToRpn method.

string result = PIEBALD.Lib.InfixToRpn
(
    source
, 
    PIEBALD.Lib.InfixToRpn.RpnOptions.None
) ;

string result = PIEBALD.Lib.InfixToRpn
(
    source
, 
    PIEBALD.Lib.InfixToRpn.RpnOptions.FailOnMalformedExpression
) ;

The mathematical operators (+, -, *, /, %, ^ (exponentiation)) and parentheses are supported.

Verbatim text (which can be used for terms in Scientific notation) can be specified between the ' (ASCII 96) and ' delimiters.

Supporting Characters

The following items are also in the file:

RpnOptions

This enumeration is used to provide the user the ability to control some optional features of the InfixToRpn method.

RetainWhiteSpace

Retain the whitespace, control characters, and null characters from the source string.

FailOnMismatchedParenthesis

Throw an InfixToRpnFormatException if a mismatched parenthesis is detected.

FailOnMalformedExpression

Throw an InfixToRpnFormatException when various other situations occur.

All

All of the options.

None

None of the options.

InfixToRpnFormatException

If FailOnMismatchedParenthesis and/or FailOnMalformedExpression is specified and such a condition is detected, then an instance of this exception will be thrown. The class has a public readonly field named Position that will contain the (one-based) position of the character that appears to be errant.

OperatorPrecedence

This enumeration is used to associate a precedence with an operator in the following class.

Operator

An instance of the Operator class contains only the operator character and its precedence value.

OperatorStack

An OperatorStack is a Stack<Operator>. It has its own versions of Push and Clear which will append removed Operators directly to the output.

OperatorStackStack

An OperatorStackStack is a Stack<OperatorStack>. The InfixToRpn method uses one of these, which is initialized with one OperatorStack. Each OperatorStack represents one pair of parentheses. Each time a left parenthesis is encountered, a new OperatorStack is instantiated and pushed onto the OperatorStackStack. Each time a right parenthesis is encountered, the top OperatorStack is popped off and cleared (the bottom-most OperatorStack does not get popped off).

ValidatorState

This enumeration defines states for the validator.

NoMoreOpsAllowed

We had a second operator.

OnlyOneOpAllowed

We had a left-parenthesis.

UnaryOpAllowed

We had a first operator.

BinaryOpAllowed

We had a right-parenthesis.

TermBegun

We had a term character.

TermEnded

We had a term-ending whitespace operator.

VerbatimTextBegun

In verbatim text mode.

InfixToRpn

The method is rather lengthy so I won't present the whole thing. It consists primarily of a foreach loop across the characters of the Subject string and a switch. Here are the major pieces (result is a StringBuilder member field):

OperatorStackStack ops = new OperatorStackStack ( new OperatorStack ( 0 ) ) ;

int index = 0 ;

ValidatorState validatorstate = ValidatorState.OnlyOneOpAllowed ;

result.Length = 0 ;

result.EnsureCapacity ( Subject.Length * 2 ) ;

foreach ( char thischar in Subject )
{
    index++ ;

    switch ( thischar )
    {
        // See below
    }

    while ( ops.Count > 0 )
    {
        ops.Pop().Clear() ;
    }

    return ( result.ToString() ) ;
}

Inside the switch are (of course) case statements for the various characters that need to be interpreted. Here is the case for multiplication, division, and modulo:

case '*' :
case '/' :
case '%' :
{

    if ( failonmalformedexpression )
    {
        if
        (
            ( validatorstate == ValidatorState.UnaryOpAllowed )
        ||
            ( validatorstate == ValidatorState.NoMoreOpsAllowed )
        )
        {
            throw ( new InfixToRpnFormatException
            (
                "Too many operators in a row"
            ,
                index
            ) ) ;
        }
    }

    ops.Peek().Push
    (
        new Operator ( thischar , OperatorPrecedence.Medium )
    ) ;

    validatorstate = ValidatorState.UnaryOpAllowed ;

    break ;
}

Here is the case for addition, subtraction, unary positive, and unary negative. Notice that unary operators receive Highest precedence while their binary counterparts receive Low precedence:

case '-' :
case '+' :
{
    if ( failonmalformedexpression )
    {
        if ( validatorstate == ValidatorState.NoMoreOpsAllowed )
        {
            throw ( new InfixToRpnFormatException
            (
                "Too many operators in a row"
            ,
                index
            ) ) ;
        }
    }

    if
    (
        ( validatorstate == ValidatorState.UnaryOpAllowed )
    ||
        ( validatorstate == ValidatorState.OnlyOneOpAllowed )
    )
    {
        /* Unary */
        ops.Peek().Push
        (
            new Operator ( thischar , OperatorPrecedence.Highest )
        ) ;

        validatorstate = ValidatorState.NoMoreOpsAllowed ;
    }
    else
    {
        /* Binary */
        ops.Peek().Push
        (
            new Operator ( thischar , OperatorPrecedence.Low )
        ) ;

        validatorstate = ValidatorState.UnaryOpAllowed ;
    }

    break ;
}

A left-parenthesis causes a new OperatorStack to be pushed onto the OperatorStackStack. Here is the case for left-parenthesis:

case '(' :
{
    if ( failonmalformedexpression )
    {
        if
        (
            ( validatorstate == ValidatorState.TermBegun )
        ||
            ( validatorstate == ValidatorState.TermEnded )
        ||
            ( validatorstate == ValidatorState.BinaryOpAllowed )
        )
        {
            throw ( new InfixToRpnFormatException
            (
                "No operator preceding left-parenthesis"
            ,
                index
            ) ) ;
        }
    }

    ops.Push ( new OperatorStack ( index ) ) ;

    validatorstate = ValidatorState.OnlyOneOpAllowed ;

    break ;
}

A right-parenthesis causes the top OperatorStack to be popped from the OperatorStackStack (unless it's the only OperatorStack) and Cleared. Here is the case for right-parenthesis:

case ')' :
{
    if ( failonmalformedexpression )
    {
        if
        (
            ( validatorstate == ValidatorState.UnaryOpAllowed )
        ||
            ( validatorstate == ValidatorState.NoMoreOpsAllowed )
        )
        {
            throw ( new InfixToRpnFormatException
            (
                "Operator followed by right-parenthesis"
            ,
                index
            ) ) ;
        }

        if ( validatorstate == ValidatorState.OnlyOneOpAllowed )
        {
            throw ( new InfixToRpnFormatException
            (
                "Empty parentheses"
            ,
                index
            ) ) ;
        }
    }

    if ( ops.Count == 1 )
    {
        if ( failonmismatchedparenthesis )
        {
            throw ( new InfixToRpnFormatException
            (
                "Mismatched right-parenthesis"
            ,
                index
            ) ) ;
        }

        ops.Peek().Clear() ;
    }
    else
    {
        ops.Pop().Clear() ;
    }

    validatorstate = ValidatorState.BinaryOpAllowed ;

    break ;
}

The last case I'll show is for the terms of the expression, the operands. A term is basically any consecutive characters that are neither operators nor whitespace. (Verbatim text forms a term, and may contain operator characters and whitespace; but verbatim text has its own special processing and I won't cover it here.)

default :
{
    if ( failonmalformedexpression )
    {
        if ( validatorstate == ValidatorState.BinaryOpAllowed )
        {
            throw ( new InfixToRpnFormatException
            (
                "No operator preceding term"
            ,
                index
            ) ) ;
        }

        if ( validatorstate == ValidatorState.TermEnded )
        {
            throw ( new InfixToRpnFormatException
            (
                "Second consecutive term detected"
            ,
                index
            ) ) ;
        }
    }

    result.Append ( thischar ) ;

    validatorstate = ValidatorState.TermBegun ;

    break ;
}

Using the Code

The zip also contains Rpn.cs, a very simple console program. Compile it and LibRpn.cs with your choice of C# compiler.

Sample output:

C:\>rpn 2+3*4
2 3 4 * +
C:\>rpn a/b-c
a b / c -
C:\>rpn x * -4
x
C:\>rpn "x * -4"
x 0 4 - *
C:\>rpn "x - -4"
x 0 4 - -
C:\>rpn "x - 1/4"
x 1 4 / -
C:\>rpn "x - /4"
Too many operators in a row at position 5
C:\>rpn "x ---4"
Too many operators in a row at position 5

History

  • First posted - 2006-01-16
  • Updated - 2006-11-27
  • Updated - 2009-03-10 - Reworked the article. There are only minor changes to the code, mostly to the comments.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.

A list of licenses authors might use can be found here