Re: The speed of an Earley parser?

"Ira D. Baxter" <idbaxter@semdesigns.com>
2 Jul 2001 00:34:54 -0400

          From comp.compilers

Related articles
The speed of an Earley parser? newspub@wuggy.co.uk (2001-06-17)
Re: The speed of an Earley parser? sting@linguist.Dartmouth.EDU (Michael J. Fromberger) (2001-06-21)
Re: The speed of an Earley parser? joachim_d@gmx.de (Joachim Durchholz) (2001-06-21)
Re: The speed of an Earley parser? newspub@wuggy.co.uk (2001-06-28)
Re: The speed of an Earley parser? idbaxter@semdesigns.com (Ira D. Baxter) (2001-07-02)
Re: The speed of an Earley parser? news0@greynode.net (Benjamin S.Scarlet) (2001-08-06)
Re: The speed of an Earley parser? Mark.van.den.Brand@cwi.nl (M.G.J. van den Brand) (2001-08-08)
Re: The speed of an Earley parser? holzmueller@ics-ag.de (2001-08-15)
| List of all articles for this month |

From: "Ira D. Baxter" <idbaxter@semdesigns.com>
Newsgroups: comp.compilers
Date: 2 Jul 2001 00:34:54 -0400
Organization: Compilers Central
References: 01-06-041
Keywords: parse
Posted-Date: 02 Jul 2001 00:34:54 EDT

"Ian Woods" <newspub@wuggy.co.uk> wrote in message
> I'm working on a semi-learning, semi-serious project which requires
> some CF parsing. There are ... considerations on the selection
> of the parsing technique to be used:
>
> o This one takes a little explanation. Imagine a language construct a
> little like this:
>
> // language foo
>
> lang "bar"
> (
> (* this is language bar! *)
> )
>
> // more lang foo
>
> You can define a block but instead of it being some fixed language,
> the language is specified as part of the 'host' languages construct.
>
> This kind of thing, I imagine, either requires the construction of a
> meta grammar which contains all of the grammars for all of the
> languages with sufficient syntax for the above (which is probably a
> very bad idea since it will be huge)..


Agreed. But it won't be any "huger" than the sum of the sizes of the other
grammars.
The biggest object I'd have would be that it wouldn't be very modular.


> or I need to be able to
> 'switch' from one grammar to another and splice their parse trees
> together (which seems a much better idea to me). This switching
> operation requires that the grammar is effectively a data structure in
> memory rather than encoded into a program (like most of the common
> tools seem to produce - programs which can parse one specific
> grammar).
>
> The question is this:
>
> What parsing techniques have the properties I require?


If you follow the switching idea, then each grammar can technically
have its own parsing technology. In practice, you only want to write
the parsing foundations once, though, so I'm sympathetic to the
requirement for very robust parsing technology that can handle a wide
variety of grammars. That is the driver for CF parsers, in my
opinion, more than anything else.


Earley parsers have been discussed already. For our DMS Software
Reengineering Toolkit (see website if you want more details), we use
Tomita (GLR) parsers.


I won't go into them, because they have already been discussed earlier
:-} this year. The really nice thing about GLR parsers is that they
are explicitly designed to parse ambiguous sentences, which means you
can separate the parsing and semantic analysis phases completely.. I
don't think Earley parsers have any particular support for this


From your perspective, DMS implements exactly the language switching
idea you've outlined above. This allows us to write patterns and
transforms that can operate on mixed-language notations. This is
quite handly when refining syntax in one langauge into syntax in
another; individual transforms can produce results half in one
langauge, half in another. Following transforms can match whatever
parts they need.


--
Ira D. Baxter, Ph.D. CTO Semantic Designs, Inc.
http://www.semdesigns.com


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.