Flex and Dynamic Data buffer: Advanced technique

Chris F Clark <cfc@world.std.com>
23 May 2002 01:24:46 -0400

          From comp.compilers

Related articles
Flex and Dynamic Data buffer: Advanced technique ymotiwala@yahoo.com (2002-05-17)
Flex and Dynamic Data buffer: Advanced technique cfc@world.std.com (Chris F Clark) (2002-05-23)
Re: Flex and Dynamic Data buffer: Advanced technique clint@0lsen.net (Clint Olsen) (2002-05-23)
| List of all articles for this month |

From: Chris F Clark <cfc@world.std.com>
Newsgroups: comp.compilers
Date: 23 May 2002 01:24:46 -0400
Organization: Compilers Central
References: 02-05-088
Keywords: lex
Posted-Date: 23 May 2002 01:24:46 EDT

Yusuf Motiwala wrote:
> Is it possible to stop the flex when buffer exhausted and restart
> again when more data is available. Flex should restart with previous
> state where it left with previous buffer so that it can process
> splitted tokens. . . .


What you are looking for is an "event driven lexer" or a lexer that
handles "inverted control flow". It may also be called different
things by other people. The basic idea is you want the input to drive
the parsing process, and not vice versa.


I don't know of any free implementations of that concept. In fact, I
only know of one implementation at all (Yacc++, which as a disclaimer
I will say that I have a biased opinion on as one of the co-authors).


Inverting the flow-of-control in the parsing process is not that hard
if you have a table-driven toolset that outputs objects. Simply,
replace the call statements where the parser calls the lexer with
returns and similarly, when the lexer calls the file system, those
also become returns. Next, replace the places where the lexer returns
with a call to the parser. Finally, create an input object that
represents the input stream and have that call the lexer when it has
characters in its buffer. In practice, there are a few more details
than that, but that is the basic concept.


Note, you cannot invert the flow-of-control in a recursive descent
parser, because the call stack of the parser contains implicit state
information. This (and this alone) is the one reason why Yacc++ does
not also generate recursive descent parsers (as well as table driven
ones), despite having developed a technology that would allows us to
do so for LR grammars.


Also the toolset does not have to generate true objects for the
control-flow inversion to work, it just needs to keep its state
somewhere other than the call stack (e.g. static or global variables
would work also, just not as prettily).


Hope this helps,
-Chris


*****************************************************************************
Chris Clark Internet : compres@world.std.com
Compiler Resources, Inc. Web Site : http://world.std.com/~compres
3 Proctor Street voice : (508) 435-5016
Hopkinton, MA 01748 USA fax : (508) 435-4847 (24 hours)


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.