|Is This Expression Parsing Feasible? Milburn.Young@gmail.com (2007-06-27)|
|Re: Is This Expression Parsing Feasible? firstname.lastname@example.org (Martin Ward) (2007-06-28)|
|Re: Is This Expression Parsing Feasible? email@example.com (2007-06-28)|
|Re: Is This Expression Parsing Feasible? firstname.lastname@example.org (Milburn Young) (2007-06-28)|
|Re: Is This Expression Parsing Feasible? email@example.com (Jeff Kenton) (2007-06-29)|
|Re: Is This Expression Parsing Feasible? firstname.lastname@example.org (Jeff Kenton) (2007-06-30)|
|From:||Jeff Kenton <email@example.com>|
|Date:||Fri, 29 Jun 2007 20:53:51 -0400|
|Posted-Date:||30 Jun 2007 09:49:26 EDT|
> I'm new to expression parsing (but not programming) and have an
> algorithm in mind that I would like critiqued by the community. To
> make an expression tree from an expression ending in a terminator
> (e.g. ';' or '\0'), the lexer returns a random access collection of
> tokens. The parser would scan through the entire list of tokens (left
> to right) noting the token(s) with the highest precedence and making
> sub-trees out of those tokens (on the subsequent pass) and the tokens
> immediately to the left and right of those tokens. This would be done
> until there are no tokens not a part of the expression tree.
First, your method will work but requires multiple passes through the
tokens (as you noted). I wrote an expression parser that did exactly
that a long time ago.
But, you can do something similar in a single pass by using two stacks
-- one for operands and one for operators. It's basically a
shift-reduce scheme. You push each operand as you find it. For
operators, you compare the precedence of the one in hand with the one on
the top of the stack to decide whether to build a sub-tree from the top
of stack operator + operands (the result gets pushed back on the operand
stack). Either way, push the new operator afterwards.
Several tricks are worth noticing. First thing to push on the operand
stack is a beginning of expression marker. The expression terminator
will eventually match it. Parentheses are treated similarly -- the
right paren cancels the left one once the subexpression within is done.
Finally, operators with right to left precedence need to have their
precedence adjusted when pushed on the stack so that they execute in the
This used to be a common parsing technique but isn't mentioned much any
more. Aho and Ullman, The Theory of Parsing, Translation, and Compiling
discusses it (the two volume set, not the Dragon book). See the chapter
on Precedence Grammars.
Return to the
Search the comp.compilers archives again.