Re: lex question

Chris F Clark <cfc@world.std.com>
3 Apr 1999 21:58:19 -0500

          From comp.compilers


From: Chris F Clark <cfc@world.std.com>
Newsgroups: comp.compilers
Date: 3 Apr 1999 21:58:19 -0500
Organization: The World Public Access UNIX, Brookline, MA
References: 99-03-077 99-04-002
Keywords: lex

Antonio Linares <alinares@fivetech.com> asked:
> We are looking for a way to implement C language #define directive for
> expressions: [in]
> Do you know how to do this using standard lex & yacc ?


To which rkrayhawk@aol.com (RKRayhawk) replied:
> With lex/yacc technology establishing terminals for the token
> "#define" and grammar rules might not be too tough.
>
> The real nightmare is to conceive of a way to use lex or yacc to
> RECOGNIZE instances of macro invocations. Afterall, the lex/yacc table
> gen time is over by the time you get to the macro definition and
> subsequent invocation. There are no unique terminals to be associated
> with macro invocations.


This is not a nightmare, although it is highly version dependent. You
simply need to have a stack of input objects that your lexer reads
from. It is the same mechanism you would use to implement include
files. Each input object represents a different source.


You push an input object onto the stack (causing your lexer to read
the next characters from that one) whenever you reach a point where
the lexer needs to process a new include file (or macro reference).
At the end of the input (EOF on that input object), you pop the stack
and return to reading from where the lexer left off.
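
As a rough illustration of the push/pop behavior described above, here is a minimal sketch in C. It is not tied to any particular LEX version; the names Input, push_input, next_char and MAX_DEPTH are hypothetical, and the input objects are plain in-memory strings to keep it short.

```c
#include <assert.h>
#include <stdio.h>

#define MAX_DEPTH 16            /* arbitrary nesting limit for this sketch */

/* One input object: here just a string and a read position. */
typedef struct {
    const char *text;
    size_t pos;
} Input;

static Input stack[MAX_DEPTH];
static int depth = 0;           /* number of active input objects */

/* Push a new source; the lexer reads from it until it is exhausted. */
static int push_input(const char *text) {
    assert(text != NULL);
    if (depth == MAX_DEPTH)
        return -1;              /* nesting too deep (e.g. runaway includes) */
    stack[depth].text = text;
    stack[depth].pos = 0;
    depth++;
    return 0;
}

/* Return the next character, popping finished sources along the way.
   Returns EOF only when every input object has been consumed. */
static int next_char(void) {
    while (depth > 0) {
        Input *top = &stack[depth - 1];
        char c = top->text[top->pos];
        if (c != '\0') {
            top->pos++;
            return (unsigned char)c;
        }
        depth--;                /* EOF on this object: resume the one below */
    }
    return EOF;
}
```

If the lexer has read "a" from "a+b" and then pushes the text "XY", the following calls to next_char yield X, Y, +, b and finally EOF. In a real lexer you would route the generated scanner's character input through something like next_char.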


One difference between macro replacements and include files is that
you probably represent your macros as strings rather than true files,
so your lexer needs to be able to read from strings as well as files.
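
One way to let a single lexer read from both kinds of source is a small tagged union with one "get a character" routine. This is only a sketch; Source, source_from_file, source_from_string and source_getc are hypothetical names, not part of any LEX.

```c
#include <assert.h>
#include <stdio.h>

typedef enum { SRC_FILE, SRC_STRING } SourceKind;

/* An input object that is either an open file or an in-memory string. */
typedef struct {
    SourceKind kind;
    union {
        FILE *fp;               /* used when kind == SRC_FILE   */
        const char *str;        /* used when kind == SRC_STRING */
    } u;
} Source;

static Source source_from_file(FILE *fp) {
    Source s;
    assert(fp != NULL);
    s.kind = SRC_FILE;
    s.u.fp = fp;
    return s;
}

static Source source_from_string(const char *str) {
    Source s;
    assert(str != NULL);
    s.kind = SRC_STRING;
    s.u.str = str;
    return s;
}

/* Read one character from either kind of source; EOF at the end. */
static int source_getc(Source *s) {
    if (s->kind == SRC_FILE)
        return getc(s->u.fp);
    if (*s->u.str == '\0')
        return EOF;             /* end of macro text; stays at EOF */
    return (unsigned char)*s->u.str++;
}
```

The lexer then only ever calls source_getc on the top of its input stack and does not care whether the characters come from an include file or from a macro body.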


The version dependence comes from the variety of slightly different
input macros used by the different versions of "standard" LEX. Of
course, there are an untold number of places that simply solve that
problem by using one version of LEX and taking the output files to all
the different platforms. Using FLEX is another solution to that
problem.


The input object stack solves one part of the problem, getting the
text reevaluated. The other part is done by symbol table lookups.
Mark macros in your symbol table with a special type and use that type
to override the token type, as you would when doing keyword lookups.
(Presuming that you use the symbol table to do keyword lookups rather
than hardcoding the keywords into your LEX grammar.)
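
To make the symbol table part concrete, here is a small sketch where keywords and macros live in the same table and the lookup result overrides the default identifier token. The token codes, the Symbol layout and the function names are all made up for illustration, and a linear search stands in for a real hash table.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token codes; a real parser would take these from yacc. */
enum { TOK_IDENT = 256, TOK_IF, TOK_WHILE, TOK_MACRO };

typedef struct {
    const char *name;
    int token;                  /* token type returned for this name */
    const char *body;           /* replacement text for macros, else NULL */
} Symbol;

#define MAX_SYMS 64
static Symbol symtab[MAX_SYMS] = {
    { "if",    TOK_IF,    NULL },
    { "while", TOK_WHILE, NULL },
};
static int nsyms = 2;

static Symbol *lookup(const char *name) {
    int i;
    for (i = 0; i < nsyms; i++)
        if (strcmp(symtab[i].name, name) == 0)
            return &symtab[i];
    return NULL;
}

/* Record a #define. Strings are stored by pointer, so the caller must
   keep them alive; a real table would copy them. */
static int define_macro(const char *name, const char *body) {
    Symbol *s = lookup(name);
    if (s == NULL) {
        if (nsyms == MAX_SYMS)
            return -1;          /* table full */
        s = &symtab[nsyms++];
        s->name = name;
    }
    s->token = TOK_MACRO;       /* the special type that marks a macro */
    s->body = body;
    return 0;
}

/* Called by the lexer after it has matched an identifier-shaped lexeme. */
static int classify(const char *name, const Symbol **out) {
    Symbol *s = lookup(name);
    if (out != NULL)
        *out = s;
    return s != NULL ? s->token : TOK_IDENT;
}
```

When classify reports TOK_MACRO, the lexer would not return a token to the parser at all; it would push the symbol's body as a new input object and keep scanning.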


With a little cleverness you can even get the symbol table to handle
ANSI style "blue" macros (a macro name that is not expanded again
inside its own expansion) and parameter replacements.
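
A sketch of one way to do the "blue" part, assuming the term refers to the usual rule that a macro name is not re-expanded while its own replacement text is being rescanned: keep an "expanding" flag in the symbol table entry, set it when the body is pushed, and clear it when that input object is popped. The names Macro, begin_expansion and end_expansion are hypothetical, and this deliberately ignores the harder corner cases involving function-like macro arguments.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    const char *name;
    const char *body;           /* replacement text */
    int expanding;              /* nonzero while body is on the input stack */
} Macro;

/* Called when the lexer sees the macro's name.
   Returns the text to push as a new input object, or NULL if the name
   is currently "blue" and must be passed through as a plain identifier. */
static const char *begin_expansion(Macro *m) {
    assert(m != NULL);
    if (m->expanding)
        return NULL;            /* inside its own expansion: do not recurse */
    m->expanding = 1;
    return m->body;
}

/* Called when the input object holding the body hits EOF and is popped. */
static void end_expansion(Macro *m) {
    assert(m != NULL && m->expanding);
    m->expanding = 0;
}
```

With a definition such as "#define foo foo + 1", the first "foo" pushes the body; the "foo" found while reading that body gets NULL back and is returned to the parser as an ordinary identifier, so the expansion terminates.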


Finally, there are at least two "buy" versus build options.


First, most languages which need a preprocessor simply invoke the C
preprocessor. The only pain there is figuring out how to invoke it
portably. Look at autoconf for how it determines what command line
will invoke the C preprocessor and you will have covered most
Unix-like systems. That solution is "free" (zero monetary cost) and
has the benefit of leveraging all the effort the C designers put into
determining exactly how a preprocessor should resolve some of the
corner-case issues.


If you want more control of how the preprocessing is done (or don't
want to deal with the subtle variations in semantics that come with
different implementations), you could also use a lexer/parser
generator where the support for input object stacks and varying types
of input are built-in. There is at least one such tool, "Yacc++(r)
and the Language Objects Library". (Disclaimer: I helped write that
tool.) I think you will find similar support in ANTLR and PCCTS, if
not directly, then in one of the many user contributions that Tom Moog
has put together.


Hope this helps,
-Chris
*****************************************************************************
Chris Clark Internet : compres@world.std.com
Compiler Resources, Inc. CompuServe : 74252,1375
3 Proctor Street voice : (508) 435-5016
Hopkinton, MA 01748 USA fax : (508) 435-4847 (24 hours)
------------------------------------------------------------------------------
Web Site in Progress: Web Site : http://world.std.com/~compres




