Re: [Query] Tree-based parsing?

fjh@mundook.cs.mu.OZ.AU (Fergus Henderson)
18 Apr 1997 01:05:14 -0400

          From comp.compilers

Related articles
[Query] Tree-based parsing? jlilley@empathy.com (John Lilley) (1997-04-16)
Re: [Query] Tree-based parsing? fjh@mundook.cs.mu.OZ.AU (1997-04-18)
Re: [Query] Tree-based parsing? bear@sonic.net (Ray Dillinger) (1997-04-22)
Re: [Query] Tree-based parsing? geert@sun3.iaf.nl (1997-05-04)
Re: [Query] Tree-based parsing? nkramer@cs.cmu.edu (Nick Kramer) (1997-05-04)
Re: [Query] Tree-based parsing? scotts@metaware.com (Scott Stanchfield) (1997-05-08)
Re: [Query] Tree-based parsing? scotts@metaware.com (Scott Stanchfield) (1997-05-08)
Re: [Query] Tree-based parsing? wilson@cs.utexas.edu (1997-05-08)
| List of all articles for this month |

From: fjh@mundook.cs.mu.OZ.AU (Fergus Henderson)
Newsgroups: comp.compilers
Date: 18 Apr 1997 01:05:14 -0400
Organization: Comp Sci, University of Melbourne
References: 97-04-096
Keywords: parse

John Lilley <jlilley@empathy.com> writes:


>I've had this idea recently that surely has been explored by those
>more knowledgeable than I, so I want to know if there are schorarly
>articles available on what I would call a "tree-based" approach to
>source-code parsing.


Mercury uses an approach that is pretty similar to this. Mercury's
syntax is based on that of Prolog, and you can view Prolog's syntax as
also being "tree-based". I suspect that you can probably say the same
thing about Lisp.


>Essentially, I would take the following steps:
>
>1) Lex the input stream into tokens as usual.
>
>2) Convert a token stream directly into a degenerate "AST", which
>would start out as a linked list of all tokens
>
>3) Walk the flat "AST" using LL or LALR methods and match gross
>features of the language that are invariant regardless of local
>syntax.


There's no need for step 2 as a separate step. What you have
described so far is just normal lexing and parsing. The catch is that
the parser only parses the gross features of the language. In the
case of Prolog and Mercury, that means that the parser parses terms.
The parser does not figure out which terms are declarations and which
are rules, and it does not parse the goals in the bodies of rules.
(Excuse me if I slip into logic programming jargon.)


>4) Perform successive refinement steps of the AST.


Well, in the case of Mercury the compiler performs only one refinement
step. (The resulting parse tree is then converted to a simplified
data structure in the next step, but that's a different story...)


In the case of Prolog, these refinement steps may be done at different
times; some of them are even done at runtime.


Anyway, my experience with this in Mercury is that tree-based parsing
works fine.


--
Fergus Henderson <fjh@cs.mu.oz.au>
WWW: <http://www.cs.mu.oz.au/~fjh>
PGP: finger fjh@128.250.37.3
--


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.