From: Greg Titus <email@example.com>
Subject: Re: Incorporating comments in syntax tree?
Date: 13 Feb 1996 00:35:05 -0500

Uwe.Assmann@inria.fr (Uwe Assmann) writes:
> I just read a nice article on a tool called A* for language
> specific tools. It is based on pattern matching and first-order
> tree traversal. It has a nice method to handle comments, reminding
> me of a B* tree with linear chaining of leaves:
> - link all tokens in a linear list
> - intertwine comments in this list in source order
> - compose non-terminals by pointers of the terminal tokens to
> elements of the token list
> Thus comments can be reached from normal tokens by previous/next
> operations, but not from non-terminals. Thus the grammar is not
> complicated at all, but all comments can be reached starting from
> terminal tokens.
> Can anyone comment on experiences with that?
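
The scheme quoted above could be sketched roughly as follows. This is only an illustration under assumptions: the names Token, NonTerminal, link and comments_before are hypothetical and are not taken from the A* article, and a non-terminal is assumed to record just its first and last terminal.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Token:
    kind: str   # "ident", "punct", "comment", ... (hypothetical kinds)
    text: str
    # Links are excluded from repr/compare so a chained list does not recurse.
    prev: Optional["Token"] = field(default=None, repr=False, compare=False)
    next: Optional["Token"] = field(default=None, repr=False, compare=False)


@dataclass
class NonTerminal:
    name: str
    first: Token   # first terminal covered by this node
    last: Token    # last terminal covered by this node


def link(tokens: list[Token]) -> None:
    """Chain every token, comments included, in source order."""
    for a, b in zip(tokens, tokens[1:]):
        a.next, b.prev = b, a


def comments_before(tok: Token) -> list[str]:
    """Collect the run of comments immediately preceding a terminal."""
    found = []
    cur = tok.prev
    while cur is not None and cur.kind == "comment":
        found.append(cur.text)
        cur = cur.prev
    return found[::-1]


toks = [
    Token("comment", "/* loop counter */"),
    Token("ident", "int"),
    Token("ident", "i"),
    Token("punct", ";"),
]
link(toks)
decl = NonTerminal("declaration", first=toks[1], last=toks[3])
print(comments_before(decl.first))   # ['/* loop counter */']
```

The grammar never mentions comments here; they are reachable only by stepping along the token chain from a terminal.
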
I am using this sort of approach in a code-generation tool that has to
support incremental changes to generated code which the users may have
modified by hand in the meantime. Anywhere the grammar allows
whitespace, the (ad-hoc recursive-descent) parser may generate comment
and whitespace tokens, which are added to an array of tokens for the
current non-terminal in the parse tree. The tool itself then
manipulates the parse tree, generally ignoring these arrays, and
finally does an in-order traversal of the tree, walking each array, to
regenerate the changed source - comments, formatting and all.
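
A minimal sketch of that arrangement, not the tool's actual code: it assumes each non-terminal keeps one source-ordered array holding both its tokens (whitespace and comments included) and its child nodes. The names Tok, Node and regenerate are made up for the example.

```python
from dataclasses import dataclass, field
from typing import Union


@dataclass
class Tok:
    kind: str   # "word", "punct", "space" or "comment" (hypothetical kinds)
    text: str


@dataclass
class Node:
    name: str
    # Source-ordered mix of tokens (trivia included) and child nodes.
    items: list[Union[Tok, "Node"]] = field(default_factory=list)


def regenerate(node: Node) -> str:
    """Walk the tree in order, emitting every token's text verbatim."""
    out = []
    for item in node.items:
        out.append(regenerate(item) if isinstance(item, Node) else item.text)
    return "".join(out)


decl_x = Node("decl", [Tok("word", "int"), Tok("space", " "),
                       Tok("word", "x"), Tok("punct", ";")])
decl_y = Node("decl", [Tok("word", "int"), Tok("space", " "),
                       Tok("word", "y"), Tok("punct", ";")])
unit = Node("unit", [decl_x, Tok("space", " "), Tok("comment", "/* width */"),
                     Tok("space", "\n"), decl_y, Tok("space", "\n")])

original = regenerate(unit)    # round-trips the source exactly
decl_x.items[2].text = "w"     # the tool edits one token in the tree...
changed = regenerate(unit)     # ...and the comment and layout survive
```

Because the edit touches only one token, everything else, including the trailing comment, is written back byte for byte.
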
Since this approach essentially attaches each comment to the parent
non-terminal containing it, there is some extra work outside the
parser to associate a comment with the nearby lower-level construct
it "really" refers to. A variable declaration, for instance, needs to
look before and after itself in the token array to find comments on
previous lines, or on the same line following the declaration. For my
application this is fine, since most of the parse tree generally
doesn't need to be examined closely, just rewritten unchanged. If you
were producing an object-file format that associated comments with
_all_ symbols, though, you would essentially need an additional
parsing pass over the arrays of tokens to do so.
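
The before-and-after lookup might look something like the sketch below. It is a heuristic under assumptions, not the tool's code: the parent's array is assumed to mix tokens and child nodes in source order, runs of whitespace are assumed to be coalesced into single "space" tokens, and the names Tok, Node, leading_comments and trailing_comments are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Union


@dataclass
class Tok:
    kind: str   # "word", "punct", "space" or "comment" (hypothetical kinds)
    text: str


@dataclass
class Node:
    name: str
    items: list[Union[Tok, "Node"]] = field(default_factory=list)


def leading_comments(items: list, index: int) -> list[str]:
    """Comments on the lines just above the construct at items[index]."""
    found = []
    i = index - 1
    while i >= 0:
        item = items[i]
        if not isinstance(item, Tok) or item.kind not in ("space", "comment"):
            break
        if item.kind == "comment":
            starts_line = i == 0 or (
                isinstance(items[i - 1], Tok)
                and items[i - 1].kind == "space"
                and "\n" in items[i - 1].text
            )
            if not starts_line:
                break   # trailing comment of the previous construct
            found.append(item.text)
        i -= 1
    return found[::-1]


def trailing_comments(items: list, index: int) -> list[str]:
    """Comments after the construct at items[index], on the same line."""
    found = []
    for item in items[index + 1:]:
        if not isinstance(item, Tok):
            break
        if item.kind == "comment":
            found.append(item.text)
        elif item.kind != "space" or "\n" in item.text:
            break   # end of line or a non-trivia token
    return found


decl_x, decl_y = Node("decl_x"), Node("decl_y")
unit = Node("unit", [
    Tok("comment", "/* size */"), Tok("space", "\n"),
    decl_x, Tok("space", " "), Tok("comment", "/* width */"),
    Tok("space", "\n"),
    decl_y, Tok("space", "\n"),
])
print(leading_comments(unit.items, 2), trailing_comments(unit.items, 2))
```

Running this over every declaration in every array is the extra pass mentioned above; for a tool that only rewrites most of the tree unchanged, it can be done on demand for the few nodes that need it.
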
Omni Development Inc.