Building Abstract Syntax Trees from LL(1) Grammars

"Bart T." <bart@dynarec.com>
14 Aug 2002 02:22:29 -0400

          From comp.compilers

Related articles
Building Abstract Syntax Trees from LL(1) Grammars bart@dynarec.com (Bart T.) (2002-08-14)
Re: Building Abstract Syntax Trees from LL(1) Grammars no@mail.com (Capitaine Caverne) (2002-08-23)
Re: Building Abstract Syntax Trees from LL(1) Grammars thp@cs.ucr.edu (2002-08-23)
| List of all articles for this month |

From: "Bart T." <bart@dynarec.com>
Newsgroups: comp.compilers
Date: 14 Aug 2002 02:22:29 -0400
Organization: http://groups.google.com/
Keywords: parse
Posted-Date: 14 Aug 2002 02:22:29 EDT

Hello,


I've been slowly learning about compilers (using the Dragon's book and
Holub's "Compiler Design in C") and have decided to implement a very
simple math expression language before moving on to a more complete
compiler.


The grammar is LL(1) and the parser will be stack-based and hopefully
table driven (I'm going to do it by hand.) I also want to learn to use
yacc using this language and I'll implement a second parser with
it. The goal is to be able to swap one out and recompile with the
other -- both should generate the same syntax trees.


Anyway, these trees must be generated from the bottom up which seems a
bit awkward in a top-down stack-based parser. I know how to use a
value (or attribute, as I like to think of it) stack alongside the
actual parse stack, but I've only tried using it to pass down
inherited attributes. For generating trees from the bottom up,
passing attributes up the stack would be useful (I think these are
called "synthesized" attributes, IIRC.) I think I came up with a way
of doing it, but it is easy to break if not used carefully.


Another way of doing it would be to have the parser output the tokens
in postfix form and then build the syntax tree from the top down
recursively starting with the last token and moving to the left.


Assuming ^ is the exponent operator, 1^2^3^4 would get turned into:
1234^^^ (I believe this handles right associativity of the ^ operator
correctly), and the tree constructed would be:


                        ^
                      / \
                    1 ^
                          / \
                        2 ^
                              / \
                            3 4


Is this an acceptable solution? It sort of feels like cheating to me,
because I just can't figure out a clean way to generate the tree in
one pass. Another intermediate representation I wanted to do was a
LISP-like prefix form:


9+5-2:


(- (+ 9 5) 2)


OR


(+ 9 (- 5 2))


However, it isn't possible to generate the operators prior to the
operands in my grammar. :( At least I can't think of a way. There must
be a way, though...


Thanks,


----
Bart


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.