欢迎访问


Classical Solutions To Expression Evaluation Problems

By: Homecox. 12/11/2013

The problem of expression evaluation is a basic one in compiler theory. It usually consists of several basic operators (+, -, *, /, ^) operating on integers, and may involve the use of parenthesis.

Here we discuss several variations of this type of problems and corresponding classical solution(s). The solutions are not unique, but we just introduce some classical ones here. By classical, we mean not "Ad Hoc" methods, but based on solid theory and sound practice.


Problem I: Validate an expression consists of integers 0-9, +/- and parentheses. E.g., 3+(4-(5-(4)). Assume no space in between the characters.

Solution: This is a most basic expression parsing problem. The grammar for this can be written as:

  1. E = F + E | F - E | F
  2. F = (E) | n
  3. n = 0 | [1-9][0-9]*
The formula here can be translated directly into functions. This is recursive descent parsing. The solution is shown below.

If we want to evalute the value of such an expression, we can just replace return type of functions num(), E() and F() from bool to integer.


Problem II: An expression consists of integers, and operators +, -, *. Evaluate its value. E.g., 3+4*5-6. Assume no space in between characters, and the expression is always correct.

Soultion: A solution is to use a value stack for numbers (sn). Whenever the new operator is *, apply the operation, otherwise keep adding new number to the stack (if operator is -, add negative value of the new number). In the end, add up all values in the value stack.

It is easy to see that, if the divide operator / is used, simply use double as the data type for the stack, and add one more row "else if (op == '/') { sn.top() /= num; }" below the line to process '*' should work.

Note that this solution is the simplified version of the 2-stack algorithm to parse infix math expressions. The 2-stack algorithm is called the shunting-yard algorithm, and was invented by Edsger Dijkstra. The shunting-yard algorithm can be used to generate output in Reverse Polish Notation (RPN) or Abstract Syntax Tree (AST).


Problem III: Evaluate a proper expression consisting of float numbers (can be positive or negative), operators +, -, *, / , ^ and parentheses (, ).

Solution A: Here we build a expression parse tree (an Abstract Syntaxt Tree or AST) and then evaluate it. Precedence is used: + and - has the lowest precedence, then * and /, ^ has higher precedence, and whenever ( and ) are used, precedence is incremented by one. This idea can extend to more operators such that it extends to a full functional calculator.

The idea is whenever a subexpression has higher precedence, we insert it as a subtree of the current node. This is operator precedence parsing. Alternative to operator precedence parsing, of course, is to use grammar-directed recursive descent parsing, as in the first problem above.

The entire code is below. Note that the central part of the algorithm is function buildAST(), which is short, only about 30 LOC, and can keep this size when extended to more operators.

Function getNextToken() handles the task of parsing out the next token as either a number of operator, and increment precedence level when '(' is found, with error checking capability. Function getNextToken() can shortened for simpler number requirement.

The constructed parse tree can be easily converted into postfix notation. Evaluation is done with the eval() function easily.

Solution B: This can also be solved using the 2-stack shunting-yard algorithm. Some discussions can be found here or here.

The implementation of the 2-stack shunting-yard algorithm is below. The key function is eval(). A inner while loop is used, since the input string may contain something like a+b*(c+d)*e, so continous reduction upon '*' may be needed. Also note the handling of the precedence of '(' and ')': here its precedence is always 0, but in solution A its precedence is always 1 plus current precedence.

The basic difference between soluations A and B, is that in solution A we create a subtree for an operation of higher precedence, but in solution B we immediately calculate the result of such an operation. That is the single only significant difference. It is easy to extend from B to A, which is to extend the shunting-yard algorithm to use AST as output.


Summary

To summarize, evaluation of math expressions can be solved using recursive descent parsing based on grammar, shunting-yard algorithm based on 2 stacks (invented by Edsger Dijkstra), or operator precedence parsing. In fact, operator precedence is a generalization of the shunting-yard algorithm.

As a last note, all the 3 problems here can be solved using the 2-stack shunting-yard algorithm.

References:

  1. Shunting-yard algorithm
  2. Operator-precedence parser
  3. Recursive descent parser



Copyright © 2013 Homecox