Introduction
This article shows how to build a basic expression evaluator. For now, it evaluates numerical expressions that may include trigonometric functions. The constants e and Pi are also supported.
Background
You should have a basic knowledge of operator precedence. The small amount of Regular Expressions syntax required here is explained along the way.
The Code
IsNumeric checks if a character is numeric, and IsFunct is a custom function that checks if a character is one of the symbols that stand for a function.
tokens.Items.Clear()
' Remove all spaces so they never reach the number conversion.
expr.Text = Replace(expr.Text, " ", "")
Dim num As String, op As Char
num = ""
op = ""
' Rewrite "digit-digit" as "digit+-digit" so that "-" only ever acts as a sign.
' Matches cannot overlap, so a chain such as "1-2-3" is only partly rewritten.
Dim x As New System.Text.RegularExpressions.Regex("[0-9]\-[0-9]")
For Each abc As System.Text.RegularExpressions.Match In x.Matches(expr.Text)
    Dim l = CDbl(abc.Value.Split("-")(0))
    Dim r = CDbl(abc.Value.Split("-")(1))
    expr.Text = Replace(expr.Text, abc.Value, l & "+-" & r)
Next
' Replace each function name with a one-character symbol.
' "cosec" comes first: once "cos" or "sec" is replaced, it can no longer be found.
expr.Text = Replace(LCase(expr.Text), "cosec", "#")
expr.Text = Replace(LCase(expr.Text), "sin", "~")
expr.Text = Replace(LCase(expr.Text), "cos", "`")
expr.Text = Replace(LCase(expr.Text), "tan", "!")
expr.Text = Replace(LCase(expr.Text), "sec", "@")
expr.Text = Replace(LCase(expr.Text), "csc", "#")
expr.Text = Replace(LCase(expr.Text), "cot", "$")
' Replace the constants with their values rounded to 4 decimal places.
expr.Text = Replace(LCase(expr.Text), "pi", Math.Round(Math.PI, 4))
expr.Text = Replace(LCase(expr.Text), "e", Math.Round(Math.Exp(1), 4))
' Build the tokens character by character.
For Each s As Char In expr.Text
    If IsNumeric(s) Or s = "."c Or s = "-" Or IsFunct(s) Then
        num = num & s
    End If
    If IsOp(s) Then
        op = s
        tkns.Add(num)
        tkns.Add(op)
        op = ""
        num = ""
    End If
Next
' The last number has no operator after it.
tkns.Add(num)
' Show the tokens in the ListBox.
tokens.Items.Clear()
For Each Str As String In tkns
    tokens.Items.Add(Str)
Next
The first term you need to know when you are trying to evaluate expressions is "tokenize". It means splitting into simpler parts. For example, "12+5*3" is tokenized to "12","+","5","*", and "3".
Before tokenizing, I do some substitutions that make the later steps simpler.
First, we should remove all the spaces to avoid errors like "Cannot convert " " to Double".
Secondly, treating '-' as an operator leads to a few complications: with expressions like "5*-5", the simple logic used here would have to tell a subtraction apart from a negative sign. So, to support negative numbers with our simple logic, the easiest workaround is to replace every instance of the form "number-number" with "number+-number". While tokenizing, the - sign is then simply concatenated to the number string.
We use Regex to do this replacement. It is explained further down, after the tokenizing logic.
Then, I replace function names with symbols. For example, the trigonometric function Sine is replaced with "~". So, when I evaluate the expression, I can recognize the "~" symbol and compute the Sine of the number following it. How this is done is covered in the evaluation part.
Then, I replace PI and e with their respective values rounded to 4 decimal places. One thing to notice here is that the user might type "Pi", "pI", or "PI". So, in order to handle substitutions in general, it is better to apply the LCase function and search for the replacement string ("pi" in this case) in lower case.
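The substitution step can be sketched as a standalone function. This is only an illustration under my own assumptions: the module and the name SubstituteNames are hypothetical, and it uses the String.Replace method rather than the Replace function from the listing. The ordering is the point: longer names go first so that "cosec" is not broken up by "cos", and the constants go last because the letter "e" also occurs inside "sec" and "cosec".

```vbnet
Imports System
Imports System.Globalization

Module SubstitutionSketch
    ' Hypothetical helper: lower-cases the input, then swaps function names
    ' for one-character symbols and constants for rounded values.
    Function SubstituteNames(ByVal input As String) As String
        Dim s As String = input.ToLowerInvariant()
        ' Longer names first, so "cosec" is handled before "cos" and "sec".
        Dim names() As String = {"cosec", "sin", "cos", "tan", "sec", "csc", "cot"}
        Dim symbols() As String = {"#", "~", "`", "!", "@", "#", "$"}
        For i As Integer = 0 To names.Length - 1
            s = s.Replace(names(i), symbols(i))
        Next
        ' Constants last: by now no function name containing "e" is left.
        s = s.Replace("pi", Math.Round(Math.PI, 4).ToString(CultureInfo.InvariantCulture))
        s = s.Replace("e", Math.Round(Math.E, 4).ToString(CultureInfo.InvariantCulture))
        Return s
    End Function

    Sub Main()
        Console.WriteLine(SubstituteNames("Sin30+PI"))    ' ~30+3.1416
        Console.WriteLine(SubstituteNames("COSEC30*e"))   ' #30*2.7183
    End Sub
End Module
```

Formatting the constants with the invariant culture keeps the decimal point a "." even on systems whose regional settings use a comma.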
The logic I've been mentioning all along is:
- Read character by character in the expression.
- If the character is a number or a dot or a '-' sign or a symbol denoting a function (like "~" for sine), concatenate it to a string.
- If it is an operator, add the number to a list followed by the operator. Set the values of the number string and the operator string to be "".
- After the loop is over, add the number in the string (num in this case) to the list, because there is no operator at the end.
Then, clear the ListBox tokens and add all the entries in the list (tkns in this case) so that the user can see how the expression is tokenized. It is not necessary to include the ListBox in every expression evaluator. It is just there to check that the program tokenizes the expression properly.
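The tokenizing steps listed above can be sketched as a function with no form controls involved. The names here are mine, not part of the article's code: IsOperatorChar stands in for IsOp, which is not shown, and the operator set "+*/^" is an assumption. Because subtraction has already been rewritten as "+-", the '-' character is treated as part of a number.

```vbnet
Imports System
Imports System.Collections.Generic

Module TokenizerSketch
    ' Assumed operator set; the article's IsOp is not shown.
    Function IsOperatorChar(ByVal c As Char) As Boolean
        Return "+*/^".IndexOf(c) >= 0
    End Function

    ' Digits, the dot, the sign and the function symbols all belong to a number.
    Function IsNumberPart(ByVal c As Char) As Boolean
        Return Char.IsDigit(c) OrElse c = "."c OrElse c = "-"c OrElse "~`!@#$".IndexOf(c) >= 0
    End Function

    Function Tokenize(ByVal expr As String) As List(Of String)
        Dim result As New List(Of String)()
        Dim num As String = ""
        For Each c As Char In expr
            If IsOperatorChar(c) Then
                ' An operator closes the number collected so far.
                result.Add(num)
                result.Add(c.ToString())
                num = ""
            ElseIf IsNumberPart(c) Then
                num &= c
            End If
        Next
        ' The last number has no operator after it.
        result.Add(num)
        Return result
    End Function

    Sub Main()
        Console.WriteLine(String.Join(",", Tokenize("12+-3*~30")))   ' 12,+,-3,*,~30
    End Sub
End Module
```

Returning a List(Of String) instead of filling a shared list makes the function easy to try out on its own.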
Dim x As New System.Text.RegularExpressions.Regex("[0-9]\-[0-9]")
For Each abc As System.Text.RegularExpressions.Match In x.Matches(expr.Text)
    Dim l = CDbl(abc.Value.Split("-")(0))
    Dim r = CDbl(abc.Value.Split("-")(1))
    expr.Text = Replace(expr.Text, abc.Value, l & "+-" & r)
Next
Coming back to Regex, or Regular Expressions: it is a very useful tool for pattern matching. Here, the pattern we are trying to match is of the form "a numerical digit-a numerical digit". Let's take the expression "12-3+15-4".
The matches returned by Regex will be "2-3" and "5-4". Then, we can replace them with "2+-3" and "5+-4" so that we can tokenize (split into parts) the expression as "12","+","-3","+","15","+", "-4" and then evaluate it.
In the declaration of x, I have given "[0-9]\-[0-9]" as the pattern string. The pattern string describes what the regex finds as matches. Here it means "any digit from zero to nine, then a '-', then any digit from zero to nine". A "[]" group matches exactly one character out of the set it lists, and the "\-" means that - should be treated as a literal "-" character and not as what it means in the Regex syntax. Normally, "-" is used inside "[]" to denote a range like "0-9", "a-z", or "A-Z". "\" is the escape character that forces the next character to be treated as a literal. Outside "[]" a "-" is already literal, so the escape here is not strictly required, but it is harmless.
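One limitation of the pattern is that regex matches never overlap. In "12-3-4" the "3" is consumed by the first match "2-3", so the second minus sign is not matched and stays as it is. A possible alternative, which is not what the listing above does, is to match only the '-' itself and use a lookbehind to require a digit in front of it. The function name RewriteSubtraction is hypothetical.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module MinusRewriteSketch
    Function RewriteSubtraction(ByVal expr As String) As String
        ' (?<=[0-9]) is a lookbehind: it requires a digit before the "-"
        ' without making that digit part of the match.
        Return Regex.Replace(expr, "(?<=[0-9])-", "+-")
    End Function

    Sub Main()
        Console.WriteLine(RewriteSubtraction("12-3+15-4"))   ' 12+-3+15+-4
        Console.WriteLine(RewriteSubtraction("12-3-4"))      ' 12+-3+-4
        Console.WriteLine(RewriteSubtraction("5*-5"))        ' 5*-5
    End Sub
End Module
```

Because only the '-' is replaced, the digits around it are left untouched, and a sign that follows an operator (as in "5*-5") is not rewritten.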
Private Function EvalFunction(ByVal s As String) As Double
    ' A token without a function symbol is a plain number.
    If Not IsFunct(s.Chars(0)) Then
        Return CDbl(s)
    End If
    ' Everything after the symbol is the argument, given in degrees.
    Dim z As Double = CDbl(s.Substring(1, s.Length - 1))
    ' Convert degrees to radians, as expected by the Math functions.
    z = z * Math.PI / 180
    Select Case s.Chars(0)
        Case "~"
            Return Math.Sin(z)
        Case "`"
            Return Math.Cos(z)
        Case "!"
            Return Math.Tan(z)
        Case "@"
            Return 1 / (Math.Cos(z))
        Case "#"
            Return 1 / (Math.Sin(z))
        Case "$"
            Return 1 / (Math.Tan(z))
    End Select
End Function
This function evaluates strings like "~30" (sin 30). As I have already mentioned, the symbol is always the first character of the string. So, I check if the first character of the string passed (s here) is one of those symbols. If it is not, the string is a plain number, and I return it converted to a Double. Otherwise, I retrieve the number part using the Substring function and convert it from degrees to radians, because the trigonometric functions in Math treat the value passed as radians. The conversion is a direct proportion. 180 degrees equal PI radians. How many radians equal z degrees?
z = z * PI/180
Then, the Select Case on the first character of s picks the matching function, applies it to z, and returns the result.
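As a quick check of the conversion, here is a small sketch. ToRadians is a hypothetical helper name, not part of the code above. Both values printed should come out as one half once rounded to 4 decimal places, since sin 30 and cos 60 (in degrees) are both 0.5.

```vbnet
Imports System

Module DegreeSketch
    ' 180 degrees correspond to PI radians, so scale by PI / 180.
    Function ToRadians(ByVal degrees As Double) As Double
        Return degrees * Math.PI / 180
    End Function

    Sub Main()
        ' Rounded, because floating-point results are only approximately 0.5.
        Console.WriteLine(Math.Round(Math.Sin(ToRadians(30)), 4))
        Console.WriteLine(Math.Round(Math.Cos(ToRadians(60)), 4))
    End Sub
End Module
```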
' First pass: replace every function token such as "~30" with its value.
For i As Integer = 0 To tkns.Count - 1
    Dim s = tkns(i)
    If TypeOf s Is Char Then
        If IsFunct(s) Then
            tkns.Item(i) = EvalFunction(s)
        End If
    Else
        If IsFunct(s.Chars(0)) Then
            tkns.Item(i) = EvalFunction(s)
        End If
    End If
Next
' Second pass: reduce every "lhs ^ rhs" triple to a single value.
Dim ind = tkns.IndexOf("^"c)
While Not (ind < 0)
    Dim lhs = CDbl(tkns(ind - 1))
    Dim rhs = CDbl(tkns(ind + 1))
    Dim res = lhs ^ rhs
    ' Insert the result, then remove lhs, rhs and the operator around it.
    tkns.Insert(ind, res)
    tkns.RemoveAt(ind - 1)
    tkns.RemoveAt(ind + 1)
    tkns.RemoveAt(ind)
    ind = tkns.IndexOf("^"c)
End While
First, we evaluate the functions. Then, we go to the operators.
The logic for an operation is:
- Get the first index of the operator.
- While the operator exists:
  - Get the left value and the right value, which are located to the left and to the right of the operator, respectively.
  - Evaluate the result.
  - Insert the result at that position and remove all three: the left value, the right value, and the operator.
  - Find the next index of that operator, if there is one.
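The listing shows this loop for "^" only. A sketch of how the same steps might be repeated for the remaining operators is given below. It rests on assumptions of mine: the names ReduceLevel, FindFirst and Evaluate are hypothetical, the tokens are plain strings, and "*" and "/" are reduced together from left to right so that "8/2*3" gives 12. There is no "-" level because subtraction was rewritten as "+-" earlier.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Globalization

Module ReduceSketch
    ' Index of the first token that is one of the given operators, or -1.
    Function FindFirst(ByVal tokens As List(Of String), ByVal ops() As String) As Integer
        For i As Integer = 0 To tokens.Count - 1
            If Array.IndexOf(ops, tokens(i)) >= 0 Then Return i
        Next
        Return -1
    End Function

    ' Reduces every "lhs op rhs" triple for one precedence level, left to right.
    Sub ReduceLevel(ByVal tokens As List(Of String), ByVal ops() As String)
        Dim ind As Integer = FindFirst(tokens, ops)
        While ind >= 0
            Dim lhs As Double = Double.Parse(tokens(ind - 1), CultureInfo.InvariantCulture)
            Dim rhs As Double = Double.Parse(tokens(ind + 1), CultureInfo.InvariantCulture)
            Dim res As Double
            Select Case tokens(ind)
                Case "^"
                    res = lhs ^ rhs
                Case "*"
                    res = lhs * rhs
                Case "/"
                    res = lhs / rhs
                Case Else
                    res = lhs + rhs
            End Select
            ' Replace the three tokens with the single result token.
            tokens.RemoveRange(ind - 1, 3)
            tokens.Insert(ind - 1, res.ToString("R", CultureInfo.InvariantCulture))
            ind = FindFirst(tokens, ops)
        End While
    End Sub

    Function Evaluate(ByVal tokens As List(Of String)) As Double
        ' Assumed precedence: power, then multiply/divide, then add.
        ReduceLevel(tokens, New String() {"^"})
        ReduceLevel(tokens, New String() {"*", "/"})
        ReduceLevel(tokens, New String() {"+"})
        Return Double.Parse(tokens(0), CultureInfo.InvariantCulture)
    End Function

    Sub Main()
        ' 12 + -3 * 2 ^ 3  =  12 + -3 * 8  =  12 + -24  =  -12
        Dim tokens As New List(Of String)(New String() {"12", "+", "-3", "*", "2", "^", "3"})
        Console.WriteLine(Evaluate(tokens))
    End Sub
End Module
```

Removing the three tokens first and inserting the result afterwards reaches the same end state as the insert-then-remove order in the listing, with one less index to keep track of.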
That's it! We have developed a simple mathematician :)
Points of Interest
Substituting a symbol for each function made the whole process easier. Treating subtraction as the addition of a negative number kept the tokenizing logic simple.
Using Regular Expressions: if you want to learn more about Regex, you can visit this website for a simple, yet good source for learning Regex: http://www.zytrax.com/tech/web/regex.htm.