Re: Writing Grammars

Related articles
Writing Grammars newsham@lava.net (Tim Newsham) (2002-07-25)
Re: Writing Grammars idbaxter@semdesigns.com (Ira Baxter) (2002-07-31)
Re: Writing Grammars whopkins@alpha2.csd.uwm.edu (Mark) (2002-07-31)
Re: Writing Grammars pfroehli@ics.uci.edu (Peter H. Froehlich) (2002-07-31)
Re: Writing Grammars newsham@lava.net (Tim Newsham) (2002-08-04)
Re: Writing Grammars parsersinc@yahoo.com (SLK Parsers) (2002-08-04)
Re: Writing Grammars vbdis@aol.com (VBDis) (2002-08-10)

From: "VBDis" <vbdis@aol.com>
Newsgroups: comp.compilers
Date: 10 Aug 2002 02:01:28 -0400
Organization: AOL Bertelsmann Online GmbH & Co. KG http://www.germany.aol.com
References: 02-08-015
Keywords: parse
Posted-Date: 10 Aug 2002 02:01:28 EDT

"Tim Newsham" <newsham@lava.net> writes:


>why not try to manipulate the transduction until there
>is a grammar that accepts the same language but is parseable by an
>efficient parsing technique.


Thanks for enlightening me; I had never fully appreciated the
difference between a grammar and a parser.


Can't we then abstract a bit further and use the language itself as
the base for which a translator must be implemented? By "language" I
here mean a possibly very raw description of the syntactic elements
(rules), together with the actions required to transform some input
into output (an AST...), based on the rules applicable to that input.
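
To make this concrete, here is a rough Python sketch (purely
illustrative, all names made up) of such a raw description, pairing
each syntactic rule with the action that turns matched input into
output:

    from dataclasses import dataclass

    @dataclass
    class BinOp:                # one kind of output (AST) node
        op: str
        left: object
        right: object

    # Each entry pairs the syntactic elements of a rule with the
    # action that transforms the matched input into output.
    language = {
        "sum": {
            "rule":   ("term", "'+'", "term"),
            "action": lambda l, _op, r: BinOp("+", l, r),
        },
    }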


My question is based on the observation that many grammars are
bloated by splitting general rules into multiple specialized rules,
which more precisely describe the allowed set of elements in certain
situations. Splitting one rule can then produce a bunch of new
branches or near-duplicate rules elsewhere, and the various copies of
the originally unique rule tend to accumulate the same errors.
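
As a toy illustration (the grammar is written as Python data, purely
hypothetical):

    # One general rule:
    expr = ["IDENT",
            ("expr", "[", "expr", "]"),
            ("expr", ".", "IDENT"),
            "NUMBER",
            ("expr", "+", "expr")]

    # After splitting it into lval/rval variants, the common branches
    # are duplicated, and a fix to one copy is easily forgotten in
    # the others:
    lval = ["IDENT",
            ("lval", "[", "rval", "]"),
            ("lval", ".", "IDENT")]
    rval = ["IDENT",
            ("rval", "[", "rval", "]"),
            ("rval", ".", "IDENT"),
            "NUMBER",
            ("rval", "+", "rval")]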


Take "rval" and "lval" as a simple example, where an lval is a
restricted version of an rval. Now I wonder where the appropriate
place is to determine whether an rval or an lval is required, and
whether the input actually is one. The determination can occur in the
parser, which finds out that e.g. a subroutine parameter must be an
lval (a reference). But it could just as well occur in the
transformation (AST or code creation), when the expression is found
to be used as an lval.
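
The second option might look like this rough Python sketch (the node
classes are made up): parse everything as a general expression, then
check lvalue-ness on the finished AST:

    from dataclasses import dataclass

    @dataclass
    class Ident:
        name: str

    @dataclass
    class Num:
        value: int

    @dataclass
    class Index:                # a[i]
        base: object
        index: object

    @dataclass
    class BinOp:                # a + b
        op: str
        left: object
        right: object

    def is_lval(node):
        """Decide after parsing whether an expression may be assigned to."""
        if isinstance(node, Ident):
            return True
        if isinstance(node, Index):
            return is_lval(node.base)
        return False            # literals and arithmetic results are rvals

    # e.g. while generating code for 'a[i] = b + 1':
    assert is_lval(Index(Ident("a"), Ident("i")))
    assert not is_lval(BinOp("+", Ident("b"), Num(1)))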


But does this make any difference to the overall parsing process? Or
am I simply confusing "simple" languages, where the applicable rules
are determined very early, with "complicated" languages, where the
applicable rules can be determined only after a big chunk of
unspecific input has been read?


Perhaps I should mention here that my background is in "simple"
languages like Basic or Pascal, and that I have only a vague idea of
C++ as a "complicated" language. And that I definitely prefer
top-down parsing... ;-)


Back to the general thread: I suspect that there usually exist far
fewer patterns for creating the output than for interpreting the
input. Wouldn't it then be easier to start with the output creation,
and from there determine the tokens that can occur in the input (the
grammar)? What remains is to determine which kind of output shall be
created for a given input. Since we have already determined the rules
(sub-languages) applicable to the various kinds of output, the
remaining task is to find the keys in these sub-grammars that can
guide the parser through the various alternatives.


Perhaps what I have in mind is something like a two-level parser,
where the bottom-level parser processes all unambiguous rules, and
the top-level parser looks for the patterns that definitely trigger
the creation of a specific kind of output, and then instructs the
bottom-level parser how to handle the not-yet-processed tokens?
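
Roughly like this toy Python sketch, where everything is made up and
only the division of labour matters: the bottom level reduces
whatever is unambiguous, and the top level waits for a trigger token
before deciding which sub-grammar applies:

    def bottom_level(tokens):
        """Reduce only the unambiguous parts (here just number
        literals); leave everything else as raw tokens."""
        return [("NUM", int(t)) if t.isdigit() else ("RAW", t)
                for t in tokens]

    def parse_as(kind, toks):
        # placeholder for re-parsing buffered tokens under a sub-grammar
        return (kind, toks)

    def top_level(tokens):
        pre = bottom_level(tokens)
        # trigger pattern: an '=' tells us the prefix must be an lval,
        # so we instruct the bottom level how to interpret it
        if ("RAW", "=") in pre:
            i = pre.index(("RAW", "="))
            return ("assign", parse_as("lval", pre[:i]),
                              parse_as("rval", pre[i + 1:]))
        return ("expr", parse_as("rval", pre))

    print(top_level(["x", "=", "42"]))
    # ('assign', ('lval', [('RAW', 'x')]), ('rval', [('NUM', 42)]))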


You may object that this is just what the shift and reduce steps do
in a traditional parser, but in my view the reductions should not
(only) be constructed from syntactic requirements; instead they
should be derived from the possible sub-components of the various
output patterns. An automated parser generator could then be fed with
the basic rules, which can be read directly from the language
specification, and with the output data structures or procedures. It
should then be possible to automatically create the code or automaton
for the missing rules, which traditionally are given as hand-crafted
transformations of the language description.
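
As a toy example of feeding the generator the output data structures
(again Python, again entirely hypothetical): the fields of each
output node already suggest a skeleton production:

    from dataclasses import dataclass, fields

    @dataclass
    class Assign:               # output structure for an assignment
        target: "Lval"
        value: "Rval"

    @dataclass
    class Lval:
        name: str

    @dataclass
    class Rval:
        text: str

    def production_from(node_cls):
        """Derive a skeleton grammar production from an output node."""
        rhs = [f.type if isinstance(f.type, str) else f.type.__name__
               for f in fields(node_cls)]
        return node_cls.__name__.lower() + " : " + " ".join(
            r.lower() for r in rhs)

    for cls in (Assign, Lval, Rval):
        print(production_from(cls))
    # assign : lval rval
    # lval : str
    # rval : str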




Just my 0.02 Euro (where's that damn cent symbol on my German keyboard? ;-)


DoDi

