Introduction
The class presented in this article can be used to transform a mathematical expression from infix notation to Reverse Polish notation. By default, there is no validation of the expression, but some simple validation may be turned on through an Options
parameter.
See Shunting yard algorithm for more information.
Background
I needed this for the ParseInfix
method for my Rational type.
The class exposes only one public
method: the InfixToRpn
method.
string result = PIEBALD.Lib.InfixToRpn
(
source
,
PIEBALD.Lib.InfixToRpn.RpnOptions.None
) ;
string result = PIEBALD.Lib.InfixToRpn
(
source
,
PIEBALD.Lib.InfixToRpn.RpnOptions.FailOnMalformedExpression
) ;
The mathematical operators (+, -, *, /, %, ^ (exponentiation)) and parentheses are supported.
Verbatim text (which can be used for terms in Scientific notation) can be specified between the ' (ASCII 96) and ' delimiters.
Supporting Characters
The following items are also in the file:
RpnOptions
This enumeration is used to provide the user the ability to control some optional features of the InfixToRpn
method.
RetainWhiteSpace
Retain the whitespace, control characters, and null
characters from the source string.
FailOnMismatchedParenthesis
Throw an InfixToRpnFormatException
if a mismatched parenthesis is detected.
FailOnMalformedExpression
Throw an InfixToRpnFormatException
when various other situations occur.
All
All of the options.
None
None of the options.
InfixToRpnFormatException
If FailOnMismatchedParenthesis
and/or FailOnMalformedExpression
is specified and such a condition is detected, then an instance of this exception will be thrown. The class has a public
readonly field named Position
that will contain the (one-based) position of the character that appears to be errant.
OperatorPrecedence
This enumeration is used to associate a precedence with an operator in the following class.
Operator
An instance of the Operator
class contains only the operator character and its precedence value.
OperatorStack
An OperatorStack
is a Stack<Operator>
. It has its own versions of Push and Clear which will append removed Operators directly to the output.
OperatorStackStack
An OperatorStackStack
is a Stack<OperatorStack>
. The InfixToRpn
method uses one of these, which is initialized with one OperatorStack
. Each OperatorStack
represents one pair of parentheses. Each time a left parenthesis is encountered, a new OperatorStack
is instantiated and pushed onto the OperatorStackStack
. Each time a right parenthesis is encountered, the top OperatorStack
is popped off and cleared (the bottom-most OperatorStack
does not get popped off).
ValidatorState
This enumeration defines states for the validator.
NoMoreOpsAllowed
We had a second operator.
OnlyOneOpAllowed
We had a left-parenthesis.
UnaryOpAllowed
We had a first operator.
BinaryOpAllowed
We had a right-parenthesis.
TermBegun
We had a term character.
TermEnded
We had a term-ending whitespace operator.
VerbatimTextBegun
In verbatim text mode.
InfixToRpn
The method is rather lengthy so I won't present the whole thing. It consists primarily of a foreach
loop across the characters of the Subject string
and a switch
. Here are the major pieces (result is a StringBuilder
member field):
OperatorStackStack ops = new OperatorStackStack ( new OperatorStack ( 0 ) ) ;
int index = 0 ;
ValidatorState validatorstate = ValidatorState.OnlyOneOpAllowed ;
result.Length = 0 ;
result.EnsureCapacity ( Subject.Length * 2 ) ;
foreach ( char thischar in Subject )
{
index++ ;
switch ( thischar )
{
}
while ( ops.Count > 0 )
{
ops.Pop().Clear() ;
}
return ( result.ToString() ) ;
}
Inside the switch
are (of course) case
statements for the various characters that need to be interpreted. Here is the case
for multiplication, division, and modulo:
case '*' :
case '/' :
case '%' :
{
if ( failonmalformedexpression )
{
if
(
( validatorstate == ValidatorState.UnaryOpAllowed )
||
( validatorstate == ValidatorState.NoMoreOpsAllowed )
)
{
throw ( new InfixToRpnFormatException
(
"Too many operators in a row"
,
index
) ) ;
}
}
ops.Peek().Push
(
new Operator ( thischar , OperatorPrecedence.Medium )
) ;
validatorstate = ValidatorState.UnaryOpAllowed ;
break ;
}
Here is the case
for addition, subtraction, unary positive, and unary negative. Notice that unary operators receive Highest precedence while their binary counterparts receive Low precedence:
case '-' :
case '+' :
{
if ( failonmalformedexpression )
{
if ( validatorstate == ValidatorState.NoMoreOpsAllowed )
{
throw ( new InfixToRpnFormatException
(
"Too many operators in a row"
,
index
) ) ;
}
}
if
(
( validatorstate == ValidatorState.UnaryOpAllowed )
||
( validatorstate == ValidatorState.OnlyOneOpAllowed )
)
{
ops.Peek().Push
(
new Operator ( thischar , OperatorPrecedence.Highest )
) ;
validatorstate = ValidatorState.NoMoreOpsAllowed ;
}
else
{
ops.Peek().Push
(
new Operator ( thischar , OperatorPrecedence.Low )
) ;
validatorstate = ValidatorState.UnaryOpAllowed ;
}
break ;
}
A left-parenthesis causes a new OperatorStack
to be pushed onto the OperatorStackStack
. Here is the case
for left-parenthesis:
case '(' :
{
if ( failonmalformedexpression )
{
if
(
( validatorstate == ValidatorState.TermBegun )
||
( validatorstate == ValidatorState.TermEnded )
||
( validatorstate == ValidatorState.BinaryOpAllowed )
)
{
throw ( new InfixToRpnFormatException
(
"No operator preceding left-parenthesis"
,
index
) ) ;
}
}
ops.Push ( new OperatorStack ( index ) ) ;
validatorstate = ValidatorState.OnlyOneOpAllowed ;
break ;
}
A right-parenthesis causes the top OperatorStack
to be popped from the OperatorStackStack
(unless it's the only OperatorStack
) and Clear
ed. Here is the case
for right-parenthesis:
case ')' :
{
if ( failonmalformedexpression )
{
if
(
( validatorstate == ValidatorState.UnaryOpAllowed )
||
( validatorstate == ValidatorState.NoMoreOpsAllowed )
)
{
throw ( new InfixToRpnFormatException
(
"Operator followed by right-parenthesis"
,
index
) ) ;
}
if ( validatorstate == ValidatorState.OnlyOneOpAllowed )
{
throw ( new InfixToRpnFormatException
(
"Empty parentheses"
,
index
) ) ;
}
}
if ( ops.Count == 1 )
{
if ( failonmismatchedparenthesis )
{
throw ( new InfixToRpnFormatException
(
"Mismatched right-parenthesis"
,
index
) ) ;
}
ops.Peek().Clear() ;
}
else
{
ops.Pop().Clear() ;
}
validatorstate = ValidatorState.BinaryOpAllowed ;
break ;
}
The last case
I'll show is for the terms of the expression, the operands. A term is basically any consecutive characters that are neither operators nor whitespace. (Verbatim text forms a term, and may contain operator characters and whitespace; but verbatim text has its own special processing and I won't cover it here.)
default :
{
if ( failonmalformedexpression )
{
if ( validatorstate == ValidatorState.BinaryOpAllowed )
{
throw ( new InfixToRpnFormatException
(
"No operator preceding term"
,
index
) ) ;
}
if ( validatorstate == ValidatorState.TermEnded )
{
throw ( new InfixToRpnFormatException
(
"Second consecutive term detected"
,
index
) ) ;
}
}
result.Append ( thischar ) ;
validatorstate = ValidatorState.TermBegun ;
break ;
}
Using the Code
The zip also contains Rpn.cs, a very simple console program. Compile it and LibRpn.cs with your choice of C# compiler.
Sample output:
C:\>rpn 2+3*4
2 3 4 * +
C:\>rpn a/b-c
a b / c -
C:\>rpn x * -4
x
C:\>rpn "x * -4"
x 0 4 - *
C:\>rpn "x - -4"
x 0 4 - -
C:\>rpn "x - 1/4"
x 1 4 / -
C:\>rpn "x - /4"
Too many operators in a row at position 5
C:\>rpn "x ---4"
Too many operators in a row at position 5
History
- First posted - 2006-01-16
- Updated - 2006-11-27
- Updated - 2009-03-10 - Reworked the article. There are only minor changes to the code, mostly to the comments.