Re: keywords and identifiers..

wmm@fastdial.net
16 Sep 1999 01:53:07 -0400

          From comp.compilers


From: wmm@fastdial.net
Newsgroups: comp.compilers
Date: 16 Sep 1999 01:53:07 -0400
Organization: Deja.com - Share what you know. Learn what you don't.
References: 99-09-045
Keywords: lex

ash_nw@my-deja.com wrote:
> Had a small question concerning keywords/identifiers and possibly
> languages in general. Are there languages which allow a keyword to be
> accepted as an identifier?
>
> And, if a compiler were to be written for such a language, how would
> the parser and lexical analyzer interact? My idea here is for the parser
> to specify to the lexical analyzer what token (or set of tokens) it
> expects and the lexical analyzer checks to see if the current token
> lies in that set, else it signals an error.
> [Yes, lots of languages don't reserve their keywords, with Fortran and
> PL/I being among the better known examples. One approach is to concoct
> an extended syntax with rules like this that permit keywords to act as
> symbols.
>
> symbol: SYMBOL | IF | THEN | ELSE | ...
>
> but that usually ends up being hopelessly ambiguous. The other approach
> is to keep enough lexical state to know when a keyword is possible and
> when not, and for the lexer to hand back the appropriate kind of token.
> I've done that in a Fortran parser. It wasn't very hard, although some
> of the state management kludges were really ugly. -John]


There's another possibility that I'm sure has been used many times;
shortly after I "invented" it, I discovered that another person two
cubes down from me had done the same thing a few months earlier
without my knowledge!


In this approach, the lexer and the parser are written as if the
keywords were reserved. The only difference is that, whenever the
lexer has detected a keyword, it inspects the current parser state to
see if that keyword is expected at that point in the parse. If so, it
returns the keyword as the token, otherwise it returns an identifier.
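
As a rough illustration only (this is not the proprietary code mentioned
in this post), the decision the lexer makes can be sketched in C as
follows. The token names, the parser states, and the hand-written
expected[][] table are all hypothetical; a real implementation would
answer the "is this keyword expected?" question from the generated
parser tables rather than from a table typed in by hand.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token codes for a tiny "SELECT col FROM tbl" grammar. */
enum token { TOK_IDENT, TOK_SELECT, TOK_FROM, TOK_COUNT };

/* Hypothetical parser states for the same toy grammar. */
enum state { ST_START, ST_SELECT_LIST, ST_AFTER_COLUMN, ST_TABLE_NAME, ST_COUNT };

/* expected[s][t] is true when token t is acceptable in state s.
 * Assumption: hand-written here; a real parser would derive this
 * information from its generated tables. */
static const bool expected[ST_COUNT][TOK_COUNT] = {
    [ST_START]        = { [TOK_SELECT] = true },
    [ST_SELECT_LIST]  = { [TOK_IDENT]  = true },
    [ST_AFTER_COLUMN] = { [TOK_FROM]   = true },
    [ST_TABLE_NAME]   = { [TOK_IDENT]  = true },
};

/* Keyword spellings, matched case-sensitively to keep the sketch short. */
static const struct { const char *text; enum token tok; } keywords[] = {
    { "select", TOK_SELECT },
    { "from",   TOK_FROM   },
};

/* Classify a scanned word: return the keyword token only if the parser
 * can accept that keyword in its current state; otherwise fall back to
 * an ordinary identifier. */
enum token classify_word(const char *word, enum state current)
{
    for (size_t i = 0; i < sizeof keywords / sizeof keywords[0]; i++) {
        if (strcmp(word, keywords[i].text) == 0) {
            return expected[current][keywords[i].tok] ? keywords[i].tok
                                                      : TOK_IDENT;
        }
    }
    return TOK_IDENT;
}
```

With this table, the word "from" comes back as TOK_FROM only in
ST_AFTER_COLUMN; in ST_SELECT_LIST or ST_TABLE_NAME the same spelling is
returned as TOK_IDENT, so a column or table could be named "from".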


Determining whether the keyword is expected in the current state is a
little tricky but not too difficult. I was using Bison, and it's
relatively straightforward to see how to navigate the tables by
reading bison.simple. I'd post the code except that my employer is
pretty proprietary about such things.
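
For readers who have not looked at the generated tables, here is a toy
sketch of the kind of lookup involved. It is an assumption-laden
illustration, not the code described here: the arrays toy_pact and
toy_check only imitate the packed layout that Bison's yypact and yycheck
use, and their contents are invented for a three-keyword toy grammar.
A real version would also have to map the lexer's token code to the
parser's internal symbol number, and would have to deal with states
that have no explicit entry for a token but reach one after a default
reduction; the sketch ignores both.

```c
#include <stdbool.h>

/* Hypothetical internal token numbers. */
enum { T_EOF, T_IDENT, T_SELECT, T_FROM };

#define TOY_LAST        4       /* highest valid index into toy_check */
#define TOY_NO_ENTRIES  (-100)  /* state has only a default action */

/* toy_pact[s] is the base offset of state s into the packed table.
 * Invented contents: state 0 accepts SELECT, state 1 accepts FROM,
 * state 2 accepts IDENT, state 3 has no explicit entries at all. */
static const int toy_pact[]  = { 0, 1, 2, TOY_NO_ENTRIES };

/* toy_check[n] records which token slot n really belongs to, so that
 * overlapping rows of the packed table can be told apart. */
static const int toy_check[] = { -1, -1, T_SELECT, T_IDENT, T_FROM };

/* Return true if the given state has an explicit entry for the token. */
bool token_has_action(int state, int token)
{
    int n = toy_pact[state];
    if (n == TOY_NO_ENTRIES)
        return false;           /* only the default action applies */
    n += token;
    return n >= 0 && n <= TOY_LAST && toy_check[n] == token;
}
```

The lexer would call something like token_has_action() with the state
on top of the parser's stack and the candidate keyword's token number,
and hand back an identifier whenever the answer is false.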


Anyhow, that's a relatively automatic way of handling things without
the hair of worrying about managing lexer state and how it interacts
with parser lookahead and such; it worked out quite well with the SQL
parser I was writing.
--
William M. Miller, wmm@fastdial.net
Software Emancipation Technology (www.setech.com)

