Re: Lookahead vs. Scanner Feedback

Raul Deluth Miller-Rockwell <rockwell@socrates.umd.edu>
Sat, 4 Jan 92 19:40:47 est

          From comp.compilers


Newsgroups: comp.compilers
From: Raul Deluth Miller-Rockwell <rockwell@socrates.umd.edu>
Keywords: yacc, parse
Organization: Compilers Central
References: 92-01-012
Date: Sat, 4 Jan 92 19:40:47 est

Mark Hjelm:
      I have a parser, written using Yacc and Lex, for ANSI C. The
      grammar is taken pretty much verbatim from the standard. The
      scanner uses the symbol table to decide whether to return
      "identifier" or "typedef name" as the token type for an identifier.
      How do I KNOW that there are no situations which, due to parser
      lookahead, would cause the scanner to return an incorrect token
      type for an identifier (i.e. return "identifier", even though the


The only uncertainty here is determining what happens while a
name is being declared.


If I understand aright, you're defaulting an alphanumeric token to be
an identifier unless it first appears in a typedef as one of the
declared names. So I expect you'll have to have a production which
will declare an identifier as a typedef name. For example
(oversimplifying away things like pointer and array specifiers):


typedef_decl
: typedef_prelude identifier {decl_typedef($1, $2); $$= $1;}
| typedef_decl ',' identifier {decl_typedef($1, $3); $$= $1;}
;


typedef : typedef_decl ';' ;


where decl_typedef() is responsible for the paranoia checks and symbol
table manipulations. After being hit by the above production, that
specific identifier will be recognized by LEX as a typedef name.
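
The scanner side of this arrangement might look roughly like the
following sketch in C. Everything named here is an assumption made for
illustration: the token codes, the fixed-size table, and the helpers
record_typedef() and classify_identifier() are hypothetical stand-ins
for whatever decl_typedef() and the real symbol table do.

```c
#include <string.h>

/* Hypothetical token codes; a real parser would take them from y.tab.h. */
enum { IDENTIFIER = 258, TYPEDEF_NAME = 259 };

#define MAX_TYPEDEFS 64
#define MAX_NAME_LEN 32

/* Toy symbol table holding only the names declared by typedefs. */
static char typedef_table[MAX_TYPEDEFS][MAX_NAME_LEN];
static int typedef_count = 0;

/* Returns 1 if text has already been declared as a typedef name. */
static int is_typedef_name(const char *text)
{
    for (int i = 0; i < typedef_count; i++)
        if (strcmp(typedef_table[i], text) == 0)
            return 1;
    return 0;
}

/* Called from the parser action. Returns 0 on success, or -1 if the
   name is already declared, is too long, or the table is full. */
static int record_typedef(const char *name)
{
    if (is_typedef_name(name))
        return -1;
    if (typedef_count == MAX_TYPEDEFS || strlen(name) >= MAX_NAME_LEN)
        return -1;
    strcpy(typedef_table[typedef_count++], name);
    return 0;
}

/* Called by the scanner for every alphanumeric token that is not a
   keyword: the symbol table decides which token type is returned. */
static int classify_identifier(const char *text)
{
    return is_typedef_name(text) ? TYPEDEF_NAME : IDENTIFIER;
}
```

In a LEX specification the identifier rule would then do something
like "return classify_identifier(yytext);", so the answer depends only
on what the parser has declared by the time the token is read.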


What about lookahead and backtracking? The terminating ';' is
guaranteed to eat YACC's one token lookahead, so no identifier after
the typedef is lexed before its names have been declared. Backtracking
only occurs when an error is encountered -- and I hope you're not
expecting perfection in the face of a syntax error?


Also, note that an attempt to declare the same name twice in the same
typedef will be a syntax error. For example:


      typedef int foo, foo;


Here, when "typedef int foo" is reduced (as
typedef_prelude identifier), you are guaranteed that the input has
not been lexed any farther than "typedef int foo," so when the second
"foo" is lexed, it will be recognized as a typedef name, even with
array specifier stuff in the grammar.
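
To make the ordering concrete, here is a small hand-written simulation
in C. It is only a sketch under stated assumptions: it is not YACC
output, it handles just "typedef int NAME, NAME ... ;", and the names
next_token() and parse_typedef() are invented for the illustration.
The point it shows is that each name is declared while the parser
holds a single ',' or ';' as lookahead, before the next word is lexed.

```c
#include <string.h>

enum tok { T_TYPEDEF, T_INT, T_IDENT, T_TYPENAME, T_COMMA, T_SEMI, T_EOF };

#define MAX_NAMES 16
static const char *names[MAX_NAMES];   /* typedef names declared so far */
static int n_names;
static const char **input;             /* NULL-terminated array of words */

/* Scanner: classifies each word at the moment it is read, consulting
   the table of names the parser has declared up to that point. */
static enum tok next_token(const char **text)
{
    const char *w = *input;
    if (w == NULL)
        return T_EOF;
    input++;
    *text = w;
    if (strcmp(w, "typedef") == 0) return T_TYPEDEF;
    if (strcmp(w, "int") == 0)     return T_INT;
    if (strcmp(w, ",") == 0)       return T_COMMA;
    if (strcmp(w, ";") == 0)       return T_SEMI;
    for (int i = 0; i < n_names; i++)
        if (strcmp(names[i], w) == 0)
            return T_TYPENAME;
    return T_IDENT;
}

/* Parses one typedef. Returns 0 on success, -1 on a syntax error. */
static int parse_typedef(const char **words)
{
    const char *text = NULL;
    input = words;
    n_names = 0;
    if (next_token(&text) != T_TYPEDEF) return -1;
    if (next_token(&text) != T_INT)     return -1;
    for (;;) {
        /* The grammar wants an identifier here; a typedef name is an error. */
        if (next_token(&text) != T_IDENT)
            return -1;
        const char *name = text;
        /* One token of lookahead: only ',' or ';' has been read. */
        enum tok la = next_token(&text);
        if (la != T_COMMA && la != T_SEMI)
            return -1;
        if (n_names == MAX_NAMES)
            return -1;
        names[n_names++] = name;       /* the "reduce" step declares the name */
        if (la == T_SEMI)
            return 0;
    }
}
```

Fed the words of "typedef int foo, bar;" this returns 0, while
"typedef int foo, foo;" returns -1, because the second "foo" is read
only after the first has been declared.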


--
Raul Deluth Miller-Rockwell <rockwell@socrates.umd.edu>
--

