Abstract syntax tree grammars and generator tools

"Ralph P. Boland" <rpboland@math.uwaterloo.ca>
14 Mar 2003 11:02:41 -0500

          From comp.compilers

Related articles
Combinator/Monadic parsing tools. olczyk@interaccess.com (Thaddeus L Olczyk) (2003-02-24)
Re: Combinator/Monadic parsing tools. peteg@cse.unsw.EDU.AU (Peter Gammie) (2003-03-09)
Abstract syntax tree grammars and generator tools rpboland@math.uwaterloo.ca (Ralph P. Boland) (2003-03-14)
Re: Abstract syntax tree grammars and generator tools martin.jourdan@free.fr (2003-03-16)
Re: Abstract syntax tree grammars and generator tools chase@theworld.com (David Chase) (2003-03-17)
Re: Abstract syntax tree grammars and generator tools Nicola.Musatti@ObjectWay.it (2003-03-22)
| List of all articles for this month |

From: "Ralph P. Boland" <rpboland@math.uwaterloo.ca>
Newsgroups: comp.compilers
Date: 14 Mar 2003 11:02:41 -0500
Organization: University of Waterloo
References: 03-02-148 03-03-009
Keywords: parse, question
Posted-Date: 14 Mar 2003 11:02:41 EST

Is there such a thing as an abstract syntax tree grammar
and the generator tools to process them?


What I want is a grammar that:


1) specifies the abstract syntax trees of a language.
        This probably means additional things such as types of the nodes.


2) supports some standard attribute evaluation and semantic actions such as
        add this variable name and its type to the symbol table.




So far this doesn't sound so difficult. The thing I am really
looking for seems harder.


I want an abstract syntax tree builder generator tool that takes as input


        1) the grammar for the input language from which parse trees are built
              (we'll ignore lexical analysis for now).


        2) the grammar for the abstract syntax of the input language.


and builds a program that constructs the corresponding abstract syntax
tree from a given parse tree (built by a parser built by a parser
generator tool of course).


Such a tool provides two advantages.


1) There is no need to place attributes or semantic actions in the
grammar for the language.
          I think such attributes and semantic actions should be in the
abstract syntax tree grammar (if not later in the compile process).


2) Based on the parse syntax grammar and the abstract syntax grammar
much work can now be done automatically (and thus programming language
independantly) by the abstract syntax tree builder generated by the
abstract syntax tree builder generator tool.
          This reduces the amount of programming work needed to be done by
the compiler writer.


3) The abstract syntax grammar provides additional specification of
the language.




I will give two examples. In both cases I use a Smalltalk like
syntax. For examble in Smalltalk instead of calling a function as (in
C) rotate(a,b,c) one would write




a rotate: b by: c.


This syntax provides additional information that I think would be
useful for an abstract syntax tree builder generator tool. Note that
a function with no arguments (such as reverse(a) in C) looks like:


a reverse.




In addition, I will use regular expression operators which I think is
a must for abstract syntax tree grammars.


First example:


Parse grammar production: VarList:: 'var' varName (',' varName)* ':' typeName ';' .


Abstract grammar production: VarList:: (varNode withName: varName
                                                                                                      withType: (symbolTable getType: typeName)
                                                                                      addToSymbolTable )*.




Note the location of the (regular expression) * operators in each
production. The abstract syntax tree builder must recognize from
these two rules that, for each varName processed by the parser when
the "VarList" production is recognized, the same typeName must be
assigned to the var object being created (which becomes a node in the
abstract syntax tree). Note that a list operator would have been much
cleaner than the star operator here.




Thus we need to have a set of rules for analysing abstract syntax tree
grammars that are easy for the abstract grammar writer to apply in
writing an abstract syntax tree grammar and possible (not necessarily
easily) for the abstract syntax tree builder generator tool to analyse
sufficiently to build an abstract syntax tree builder.




Second (more complex) example:




Parse grammar production:


VarList:: 'var' varName (',' varName ('=' constValue)? )* ':' typeName ';' .




Abstract grammar production:


VarList:: ( (varNode withName: varName
                                                withType: (type = symbolTable getType: typeName)
                                                withInitValue: type value: constValue ? type value: defaultValue)
                                addToSymbolTable. )*.




These examples are unnecessarily messy in that they do a lot of symbol
table things. I expect most abstract syntax grammar rules would have
fewer attribute evaluation operations and semantic actions so would in
fact be a fairly clear specification of the abstract syntax of the
language.


Note that my description implies that the parse tree is first built
and then the abstract syntax tree is built from the parse tree. In
practice I think the parser and abstract syntax tree builder would be
combined into one tool that did both tasks in a single pass.


Thanks


Ralph Boland


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.