Re: lex question

rkrayhawk@aol.com (RKRayhawk)
1 Apr 1999 00:50:34 -0500

          From comp.compilers


From: rkrayhawk@aol.com (RKRayhawk)
Newsgroups: comp.compilers
Date: 1 Apr 1999 00:50:34 -0500
Organization: AOL http://www.aol.com
References: 99-03-077
Keywords: lex, macros, comment

Antonio Linares <alinares@fivetech.com>
Date: 23 Mar 1999 12:29:02 -0500
asks


<<We are looking for a way to implement C language #define directive for
expressions:


      #define <idFunction>([<arg list>]) [<exp>]


example:


      #define Max( x, y ) ( x > y)? x: y


Do you know how to do this using standard lex & yacc ? Any samples ?
>>


With lex/yacc technology, establishing a terminal for the token
"#define" and writing grammar rules for it might not be too tough.


The real nightmare is to conceive of a way to use lex or yacc to
RECOGNIZE instances of macro invocations. After all, the lex/yacc
table generation time is over by the time you get to the macro
definition and subsequent invocation. There are no unique terminals
to be associated with macro invocations.


Any such detection would involve productions which would be highly
text oriented, and would perhaps require a text re-evaluate mechanism
that would have to be recursive.


In other words, the output from macro #define productions would be
input for the lex/yacc gen stage! Basically, you cannot get there from
here using the parsing strategy of lex/yacc type tools. And the output
from the macro invocation detection productions would be TEXT input
for the current run of the lexer. You cannot get there either.
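
To illustrate what "output becomes TEXT input again" means, here is a
minimal sketch in C of a hand-written expander, sitting in front of
the lexer, that rescans each replacement text. Everything in it is an
assumption made for the example: the names simple_macro, table, lookup,
append, expand_into and expand_text are hypothetical, only object-like
macros (no arguments) are handled, strings and comments are not
skipped, and the depth limit is a crude stand-in for the rule real C
preprocessors use to stop a macro from expanding inside itself.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table of object-like macros (name -> replacement text). */
struct simple_macro { const char *name; const char *body; };

static const struct simple_macro table[] = {
    { "LIMIT", "(BASE + 10)" },
    { "BASE",  "32" },
};

/* Return the replacement text for an identifier, or NULL if it is not a macro. */
static const char *lookup(const char *id, size_t len)
{
    for (size_t i = 0; i < sizeof table / sizeof table[0]; i++)
        if (strlen(table[i].name) == len && strncmp(table[i].name, id, len) == 0)
            return table[i].body;
    return NULL;
}

/* Append n bytes to out, keeping it NUL-terminated; return 0 if there is no room. */
static int append(char *out, size_t cap, size_t *used, const char *s, size_t n)
{
    if (*used + n >= cap) return 0;
    memcpy(out + *used, s, n);
    *used += n;
    out[*used] = '\0';
    return 1;
}

/* Copy 'in' to 'out', replacing macro names; replacements are rescanned. */
static int expand_into(const char *in, char *out, size_t cap, size_t *used, int depth)
{
    if (depth > 16) return 0;  /* crude guard against self-referential macros */
    while (*in) {
        if (isalpha((unsigned char)*in) || *in == '_') {
            const char *start = in;
            while (isalnum((unsigned char)*in) || *in == '_') in++;
            const char *body = lookup(start, (size_t)(in - start));
            if (body) {
                /* The replacement text is treated as fresh input text. */
                if (!expand_into(body, out, cap, used, depth + 1)) return 0;
            } else if (!append(out, cap, used, start, (size_t)(in - start))) {
                return 0;
            }
        } else {
            if (!append(out, cap, used, in, 1)) return 0;
            in++;
        }
    }
    return 1;
}

/* Expand 'in' into 'out' (capacity 'cap'). Returns 1 on success, 0 on failure. */
int expand_text(const char *in, char *out, size_t cap)
{
    size_t used = 0;
    if (cap == 0) return 0;
    out[0] = '\0';
    return expand_into(in, out, cap, &used, 0);
}
```

With this table, expand_text("n < LIMIT", buf, sizeof buf) leaves
"n < (32 + 10)" in buf. Note that the rescanning happens on text,
before any token reaches a grammar rule.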


It is possible to load macro information into a symbol entry.
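
As a rough sketch of what such a symbol entry could hold, the C
fragment below stores the macro name, its parameter names and the
unparsed replacement text, and fills the entry from one "#define"
line. The struct macro_entry, the size limits and the function names
(skip_blanks, read_ident, parse_define) are all hypothetical, chosen
only for this example. It accepts only the function-like form from the
question and ignores line continuations and trailing newlines.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define MAX_PARAMS 8
#define MAX_NAME   32
#define MAX_BODY   128

/* Hypothetical symbol-table entry for one function-like macro. */
struct macro_entry {
    char name[MAX_NAME];
    char params[MAX_PARAMS][MAX_NAME];
    int  nparams;
    char body[MAX_BODY];            /* replacement text, kept as raw text */
};

static const char *skip_blanks(const char *p)
{
    while (*p == ' ' || *p == '\t') p++;
    return p;
}

/* Copy one identifier from *pp into dst and advance *pp; return 1 on success. */
static int read_ident(const char **pp, char *dst)
{
    const char *p = *pp;
    size_t n = 0;
    if (!(isalpha((unsigned char)*p) || *p == '_')) return 0;
    while ((isalnum((unsigned char)*p) || *p == '_') && n < MAX_NAME - 1)
        dst[n++] = *p++;
    dst[n] = '\0';
    if (isalnum((unsigned char)*p) || *p == '_') return 0;  /* name too long */
    *pp = p;
    return 1;
}

/* Parse "#define Name( a, b ) body" into *m. Returns 1 on success, 0 otherwise. */
int parse_define(const char *line, struct macro_entry *m)
{
    const char *p = skip_blanks(line);
    if (*p != '#') return 0;
    p = skip_blanks(p + 1);
    if (strncmp(p, "define", 6) != 0) return 0;
    p += 6;
    if (*p != ' ' && *p != '\t') return 0;
    p = skip_blanks(p);
    if (!read_ident(&p, m->name)) return 0;
    if (*p != '(') return 0;        /* '(' must directly follow the name */
    p = skip_blanks(p + 1);
    m->nparams = 0;
    if (*p != ')') {
        for (;;) {
            if (m->nparams == MAX_PARAMS) return 0;
            if (!read_ident(&p, m->params[m->nparams])) return 0;
            m->nparams++;
            p = skip_blanks(p);
            if (*p == ')') break;
            if (*p != ',') return 0;
            p = skip_blanks(p + 1);
        }
    }
    p = skip_blanks(p + 1);         /* step past ')' to the replacement text */
    if (strlen(p) >= MAX_BODY) return 0;
    strcpy(m->body, p);
    return 1;
}
```

Fed the line from the question, "#define Max( x, y ) ( x > y)? x: y",
this yields name "Max", parameters "x" and "y", and the body
"( x > y)? x: y" stored as plain text. Storing the body as text rather
than as a parse tree is deliberate: it is only meaningful once the
arguments of a particular invocation have been substituted in.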


But even if you can set up code to walk an AST that countenances it,
the traversal does not feed into the lexer! Instead, AST traversal
emits code (text matter), intermediate or final, that is outbound from
the current process. There are languages that do provide epicycles
for parsable text (for example LISP with its eval() function), but
that is not C-like.


Macros are PRE-processing text expansions in C-like languages. In a
certain sense, the #defines are not part of the language parsing at
all. That is, the #defines are not examples of grammatical C language
statements. They are examples of grammatical statements of a text
preprocessing language that is traditionally associated with the C
language, part of a larger tradition that includes myriad text
processing mechanisms from the UNIX culture.
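
A small C example may make the "text expansion" point concrete. It
uses the Max macro exactly as written in the question. The helper
names max_fn, scaled_macro and scaled_fn are invented for this
illustration.

```c
/* The macro from the question, verbatim: note there are no outer parentheses. */
#define Max( x, y ) ( x > y)? x: y

/* A real function with the same intent, for comparison. */
int max_fn(int x, int y) { return (x > y) ? x : y; }

/* After preprocessing, the return statement reads
       return 10 * ( a > b)? a: b;
   which the C grammar groups as ((10 * (a > b)) ? a : b). */
int scaled_macro(int a, int b) { return 10 * Max(a, b); }

/* Here the call is a genuine expression node, so the result is 10 * max. */
int scaled_fn(int a, int b) { return 10 * max_fn(a, b); }
```

scaled_fn(1, 3) returns 30, while scaled_macro(1, 3) returns 3. The
preprocessor pasted text and knew nothing about expressions or
operator precedence; the C parser only ever saw the pasted result.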


You asked for examples. The GNU C compiler (GCC) illustrates the
complexity of isolating the macro function in a distinct process that
is really entirely separate from the parse.


Your question does not imply that you would necessarily contemplate
the exact same forms for your macros as the C language uses. But
should you be considering a syntax like that, also keep in mind that
the parenthesized text of macro definitions and invocations is
especially difficult to distinguish from function invocations in C,
unless it is siphoned off in the PRE-process phase.


In the GNU GCC compiler, the #defines pass through to the compiler, but
it is smart enough to ignore that text at 'compile time.' In other
words, there are actually null productions for #defines.


The macro invocations are actually already expanded, and their
original text form is no longer visible at 'compile time' (although
there are parametric combinations, if I am not mistaken, that allow a
commented out artifact of original macro invocation text to pass all
the way through, for printing I suppose, but really it does not make
it to the parser).


((To clarify this minor tangent, let me state that macro expansion,
under such parameters, requires generation of a commented original
text followed by an expanded text. In a world where that expanded
text must be evaluated for potential further macro text substitution,
generally only the original text is passed through as a comment.
Intermediate expansions are not passed through as comments nor, of
course, as parsable code; only the final expansion accompanies the
commented original text. Professionals might, however, need to plan a
way for all intermediate expansions to be brought into scope for
compiler-preprocessor debugging. Yeehaaa!))
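
The "commented original followed by final expansion" output can be
sketched in a few lines of C. The function name emit_with_comment and
the exact output layout are assumptions for illustration only, not
what any particular preprocessor emits.

```c
#include <stdio.h>
#include <string.h>

/* Write the original invocation as a comment, then the final expansion.
   Returns 1 on success, 0 if the text does not fit or cannot be commented. */
int emit_with_comment(const char *original, const char *final_text,
                      char *out, size_t cap)
{
    /* An embedded comment terminator would end the comment too early. */
    if (strstr(original, "*/") != NULL) return 0;
    int n = snprintf(out, cap, "/* %s */ %s", original, final_text);
    return n >= 0 && (size_t)n < cap;
}
```

Called with "Max(a, b)" and "( a > b)? a: b", it produces
"/* Max(a, b) */ ( a > b)? a: b". The parser downstream skips the
comment and sees only the final expansion.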




You might want to review a similar recent discussion in this newsgroup:


Re: yacc: how to fix ambiguity


    "Lemaitre, Laurent" <r29173@email.sps.mot.com>
    On 30 Oct 1998 13:57:39 -0500


et seq.


It appears that the issue has been approached from several angles in
the newsgroups. You may wish to search for - macro* & func* &
comp.compiler* at http://www.dejanews.com/


Best Wishes,


Robert Rayhawk
RKRayhawk@aol.com
[Actually, in GCC the #define's are all handled in the preprocessor and
each line where a #define occurred is blank when the compiler gets it. -John]




