Introduction
This is an alternative tip to the original Converting InFix to PostFix using C#[^]. A much clearer way to implement such a simple syntax is to write a Recursive Descendent Parser[^].
My opinion is that the Shunting Yard Algortihm[^] as presumably implemented in the original post is a dead-end road regarding maintainability and extensibility. E.g. implementing an unary minus leads to complicated tricks to get it working, one has to tweak with weird precedence tables, etc.
The recurive descendent parser implements the precedence table implicitly. And further operators are easily added by inserting new unary or binary parse funcitons according to the extended syntax. More to that in a minute.
Let's start with the syntax of the language.
The Syntax
The syntax of the given language in the original post is as follows, written in EBNF[^]:
Expr = Sum .
Sum = Mul { ("-"|"+") Mul } .
Mul = Factor { ("*"|"/"|"%") Factor } .
Factor = Number | "(" Expr ")" .
This is easily imlemented as Recursive Descendent Parser - I show this just a few lines further down. But first: The parser needs tokens to parse. So the scanner provides these tokens.
The Scanner
The scanner decomposes the incomming character stream into chunks that make up the language words and punctuation, all together called tokens. We have:
- Number: integer values like
1
, 23
, 456
, etc.
-
+
*
/
%
(
)
White spaces between these shall be ignored.
A possible implementation for a scanner is a regular expression.
private IEnumerator<string> _tokens;
private bool Move() { return _tokens.MoveNext(); }
private InFixToRpn(string inFix)
{
string syntax = @"\s*(\d+|[-+*/%()]|\S)\s*";
_tokens = Regex.Matches(inFix, syntax, RegexOptions.Compiled)
.Cast<Match>().Select(t => t.Groups[1].Value).GetEnumerator();
Move();
}
Note: the regualr expression could be drastically reduced due to the fact that we only have single character operators:
...
string syntax = @"\s*(\d+|\S)\s*";
_tokens = Regex.Matches(inFix, syntax, RegexOptions.Compiled)
.Cast<Match>().Select(t => t.Groups[1].Value).GetEnumerator();
...
There is more to the scanner (see further down), but this is just the basics: decomposing the character stream by use of regular expression[^].
The Parser
The parser reads these tokens one after the other and tries to make sense of it all.
As said above, the EBNF can easily be translated into the Recursively Descendent Parser: implement each lefthand side into a function.
private void ParseExpr() { ParseSum(); }
private void ParseSum() { ParseBinary(ParseMul, "-", "+"); }
private void ParseMul() { ParseBinary(ParseFactor, "*", "/", "%"); }
private void ParseFactor() { if (IsNumber) ParseNumber(); else ParseNested(); }
private void ParseNumber() { _stack.Push(Curr); Move(); } private void ParseNested()
{
if (Curr != "(" || !Move()) Error("(...) expected");
ParseExpr();
if (Curr != ")") Error("trailing ')' expected");
Move(); }
private void ParseBinary(Action parse, params string[] ops)
{
parse();
string op;
while ((op = CurrOpAndNext(ops)) != null) { parse(); _stack.Push(op); }
}
And again, this is not all of the parser code, but this is the core. And the strong statement: there is a clear 1:1 relation between the EBNF and the parser functions. Once you understand the EBNF, you understand the parser in general. The details may need some closer look, though.
Put it all together
Now that we have all the relevant parts together, lets make it work with all the pieces in place:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;
namespace AlternativeTipForInFixToPostFixParser {
public class InFixToRpn
{
public class Res
{
public Stack<string> Stack { get; private set; }
public Exception Error { get; private set; }
public bool HasError { get { return Error != null; } }
internal Res(Stack<string> stack, Exception exc) { Stack = stack; Error = exc; }
}
public static Res Parse(string inFix) { return new InFixToRpn(inFix).Parse(); }
private IEnumerator<string> _tokens;
private Stack<string> _stack;
private string Curr { get { return _tokens.Current ?? string.Empty; } }
private bool Move() { return _tokens.MoveNext(); }
private char Ch { get { return string.IsNullOrEmpty(Curr) ? ' ' : Curr[0]; } }
private bool IsNumber { get { return char.IsNumber(Ch); } }
private string CurrOpAndNext(params string[] ops)
{
string s = ops.Contains(Curr) ? Curr : null;
if (s != null && !Move()) Error("data expected");
return s;
}
private void Error(string msg) { throw new ArgumentException("Error: " + (msg ?? "unknown error")); }
private InFixToRpn(string inFix)
{
string syntax = @"\s*(\d+|[-+*/%()]|\S)\s*";
_tokens = Regex.Matches(inFix, syntax, RegexOptions.Compiled)
.Cast<Match>().Select(t => t.Groups[1].Value).GetEnumerator();
Move();
}
private Res Parse()
{
_stack = new Stack<string>();
Exception exc = null;
try { ParseExpr(); if (Curr != "") Error("too much text given"); }
catch (Exception e) { exc = e; }
return new Res(_stack, exc);
}
private void ParseExpr() { ParseSum(); }
private void ParseSum() { ParseBinary(ParseMul, "-", "+"); }
private void ParseMul() { ParseBinary(ParseFactor, "*", "/", "%"); }
private void ParseFactor() { if (IsNumber) ParseNumber(); else ParseNested(); }
private void ParseNumber() { _stack.Push(Curr); Move(); } private void ParseNested()
{
if (Curr != "(" || !Move()) Error("(...) expected");
ParseExpr();
if (Curr != ")") Error("trailing ')' expected");
Move(); }
private void ParseBinary(Action parse, params string[] ops)
{
parse();
string op;
while ((op = CurrOpAndNext(ops)) != null) { parse(); _stack.Push(op); }
}
}
}
The call to it:
class Program
{
public static void Main(string[] args)
{
string input = "(9*(1-3))-4*(3+3*4)";
var res = InFixToRpn.Parse(input);
Console.WriteLine("Error: {0}{1}", res.HasError, res.HasError ? " ("+res.Error.Message+")" : "");
Console.WriteLine("Input: {0}", input);
Console.WriteLine("Output: {0}", string.Join(" ", res.Stack.Reverse()));
}
}
This results in:
Has Error = False
Input: (9*(1-3))-4*(3+3*4)
Output: 9 1 3 - * 4 3 3 4 * + * -
--------- That's all Folks! ---------
BTW: Checkout the links as embedded in the text.
History
V1.0 2012-04-23 Initial version.