Introduction
One day, when I was 15 years old, I went to a shop near my house and I saw a scientific calculator. When I used the calculator, there was something that took my attention. The calculator was not like the other simple calculators I used to use (at that time). There was a bar in the top of the display of the calculator that you type any expression in (e.g. 5*sin(pi/2)+5) and the calculator evaluates the result. I was amazed how the calculator understood the text, i.e. parsed the expression, and evaluated the result. Now, after being 18, I decided to make a series of programs that do the same thing. This is my first program in this series.
This article explains a simple program that parses simple expressions containing only +, -, *, /, and numbers (e.g. 5+6*3.3) and evaluates the result.
Let's Do It
I dislike too much talk, so I think the best way to illustrate my idea is to begin with an example. Consider the following expression:
1*2*3-4*5/6.6+2
The first thing we should put in mind is that multiplication and division have precedence over addition and subtraction. We should find some way to implement this. The best method is to partition the expression into terms. Thus the result will be to evaluate the following terms:
1*2*3
4*5/6.6
2
After evaluating each term alone, we add them putting in mind the sign before each expression. For example, the expression 4*5/6.6 has a minus sign, so we should subtract the value of the term from the final result instead of adding it. For this example, the following sequence will occur:
- The value of
fResult
(which we will use for the final result of the expression) will be set to zero.
- As we are using right to left parsing, terms to the right will be evaluated before terms to the left.
- The term 2 will be evaluated and added to
fResult
. Thus fResult
will have the value of 2.
- The term 4*5/6.6 (which has the value of 3.030303) will be evaluated and subtracted (because of the minus sign) from the final result. Thus
fResult
will be containing -1.030303.
- The term 1*2*3 (which has the value of 6) will be evaluated and added to the final result. Thus
fResult
will be containing 4.969697.
Let's examine the following segment of code:
bool EvaluateExpression(const char * strExpression, double & fResult)
{
int nLength = strlen(strExpression);
int nEnd = nLength - 1;
double fTerm;
fResult = 0.0;
for (int i = nLength - 1; i >= 0; i--)
if (strExpression[i] == '+')
{
if (!EvaluateTerm(strExpression + i + 1, nEnd - i, fTerm))
return false;
fResult += fTerm;
nEnd = i-1;
}
else
if (strExpression[i] == '-')
{
if (!EvaluateTerm(strExpression + i + 1, nEnd - i, fTerm))
return false;
fResult -= fTerm;
nEnd = i-1;
}
if (strExpression[0] != '+' && strExpression[0] != '-')
{
if (!EvaluateTerm(strExpression, nEnd + 1, fTerm))
return false;
fResult += fTerm;
nEnd = i-1;
}
return true;
}
As we see, the function searches the string of the expression from right to left for any '+' or '-'. A term lies either between two signs ('+' or '-') or between one sign and one end of the expression. After finding a term, the function EvaluateExpression
calls another function, which is EvaluateTerm
to evaluate the value of the found term. In a minute, we will be seeing how the function EvaluateTerm
works.
After the function has evaluated the value of the term, it examines its sign and adds the value of the term to the final result.
Now, let's take a look at the function EvaluateTerm
. This function almost has the same idea as the function EvaluateExpression
. Just like how we did with expression, the better method to find the value of a term is to partition it into numbers and evaluate the term, putting in mind the sign before each number. For example:
4*5/6.6
We should partition this term into:
4
5
6.6
and then evaluate the results considering the sign of each number. For example, for the term 4*5/6.6, the following sequence will occur:
- The value of
fResult
(which we will use for the final result of the term) will be set to one (instead of zero).
- As we are using right to left parsing, numbers to the right will be evaluated before terms to the left.
fResult
will be divided by 6.6 to result in 0.151515.
fResult
will be multiplied by 5 to result in 0.757576.
fResult
will be multiplied by 4 to result in 3.030303 which is the final result of the term.
Let's examine the code of the function:
bool EvaluateTerm(const char * strTerm, int nTermLength, double & fResult)
{
int nEnd = nTermLength - 1;
fResult = 1.0;
double fNumber;
for (int i = nEnd; i >= 0; i--)
if (strTerm[i] == '*')
{
if (!StrToFloat(strTerm + i + 1, nEnd - i, fNumber))
return false;
fResult *= fNumber;
nEnd = i-1;
}
else
if (strTerm[i] == '/')
{
if (!StrToFloat(strTerm + i + 1, nEnd - i, fNumber))
return false;
fResult /= fNumber;
nEnd = i-1;
}
if (strTerm[0] != '*' && strTerm[0] != '/')
{
if (!StrToFloat(strTerm, nEnd + 1, fNumber))
return false;
fResult *= fNumber;
}
return true;
}
As we see, the function traces the term from right to left looking for either '*' or '/'. As is the case for a term, a number lies either between two signs (now they are '*' and '/' instead of '+' and '-') or between one sign and one end of the term.
Note: The StrToFloat
function converts a string to the value it represents.
Final Notes
Before I finish my article, I want to mention the following two points:
- The user should verify that the expression doesn't contain any invalid characters. This is done in the sample using the function
VerifyExpression
. This function searches the expression string for any character other than '+', '-', '*', '/', '.', and '0'-'9'. If it finds such character(s), it will notify that the expression is invalid. It is left to the reader to see the function in the code because it's not necessary to lengthen the article too much with easy stuff.
- In this sample, I didn't use exception handling because it is not the subject of the article. But in a real application, it is better to use exception handling instead of just returning 'false' which will not tell us more than "your expression couldn't be evaluated"!
Questions
- I was about to enter 5*6 in the program, but I decided, as a test to the
VerifyExpression
function which doesn't accept white spaces as valid characters, to enter the expression 5 *6 and 5* 6. I expected that the program will display "Invalid expression!", but for my surprise, the program didn't and, instead, it displayed 5 for the first test and 0 for the second test. Why?
- The following expressions are invalid expressions but the program doesn't violate them and it may produce unexpected results:
5**6
5
1+*2
5*+6
Run the program and test them. Can you find out the reason behind the answer? Also, can you solve these problems?
- In my program, I used right to left parsing. Why do you think I did? Try to make a similar program that uses left to right parsing. Which is easier?
Remark: If you take a look at the VerifyExpression
function, you will notice that the function just searches for invalid characters. For example, the expression 5**6 will pass the test of VerifyExpression
function. How can you discover such syntax errors?
Well, if you just solve the problems mentioned in question 2, you will find that such syntax errors will automatically be discovered, how come that?