Re: Lookahead vs. Scanner Feedback

nigelh@sol.UVic.CA (R. Nigel Horspool)
8 Jan 92 21:40:04 GMT

          From comp.compilers

Related articles
[4 earlier articles]
Re: Lookahead vs. Scanner Feedback Jan.Rekers@cwi.nl (1992-01-07)
Re: Lookahead vs. Scanner Feedback burley@geech.gnu.ai.mit.edu (1992-01-07)
Re: Lookahead vs. Scanner Feedback drw@lagrange.mit.edu (1992-01-07)
Re: Lookahead vs. Scanner Feedback smk@dcs.edinburgh.ac.uk (1992-01-07)
Re: Lookahead vs. Scanner Feedback bill@twwells.com (1992-01-08)
Re: Lookahead vs. Scanner Feedback bliss@sp64.csrd.uiuc.edu (1992-01-08)
Re: Lookahead vs. Scanner Feedback nigelh@sol.UVic.CA (1992-01-08)
Re: Lookahead vs. Scanner Feedback dww@inf.fu-berlin.de (1992-01-08)
Re: Lookahead vs. Scanner Feedback jwoods@convex.com (1992-01-09)
Re: Lookahead vs. Scanner Feedback jwoods@convex.com (1992-01-10)
Re: Lookahead vs. Scanner Feedback bliss@sp64.csrd.uiuc.edu (1992-01-13)
Re: Lookahead vs. Scanner Feedback megatest!djones@decwrl.dec.com (1992-01-13)
| List of all articles for this month |

Newsgroups: comp.compilers
From: nigelh@sol.UVic.CA (R. Nigel Horspool)
Keywords: yacc, parse
Organization: University of Victoria
References: 92-01-012 92-01-026
Date: 8 Jan 92 21:40:04 GMT

Jan.Rekers@cwi.nl (Jan Rekers) writes:


>We use a different approach:
>As we use the GLR parsing technique, which is an LR parser which is
>able to split up in several LR parsers on a conflict, we are able to
>solve this problem in a very general way. The scanner just returns
>all possible interpretations of a token, the parser splits up to
>pursuit the different possibilities; some of these will die in the
>sequel and the correct choice survives. This leaves all information
>about the grammar out of the lexical scanner and is guaranteed to
>work for any case without any inspection of the grammar by a programmer.


I hate to disagree with my friend Jan, but this idea of splitting
off alternative parses introduces other nasty problems and it is
not clear that you would be any better off. Consider the
following C expression:
(foo)-3
This has two different parses depending on whether foo is declared
as a variable or as a typedef name. I.e., if foo is an int variable
then the expression is equivalent to foo-3; but if foo is a typedef
name, then the expression is equivalent to (foo)(-3) where a type
cast is being applied.


If we feed a program containing the k consecutive statements
v1 = (foo)-1;
v2 = (foo)-2;
v3 = (foo)-3;
.
.
vk = (foo)-k;
into Rekers' parser, we should get back at least 2**k different
parses.


The only way to prevent this combinatorial explosion of
possibilities is to kill off the alternative parse possibilities
as they are created. For example, the C grammar rule
<Expression> ::= ( <type-name> ) <Expression>
would need to have an associated semantic test to check whether
the type-name has actually been declared as a type name and, if
not, suppress this particular parsing possibility.


Thus there has to be feedback from semantic analysis into the
parser. I don't believe this is any great improvement over the
conventional approach of having feedback from semantic analysis to
the lexer so that typedef names can be returned as a different kind
of token.


Nigel Horspool
University of Victoria
--


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.