|lexing backwards email@example.com (Stefan Monnier) (2003-04-05)|
|Re: lexing backwards firstname.lastname@example.org (2003-04-07)|
|Re: lexing backwards email@example.com (Chris F Clark) (2003-04-07)|
|Re: lexing backwards firstname.lastname@example.org (Marat Boshernitsan) (2003-04-07)|
|Re: lexing backwards email@example.com (Stan Zaborowski) (2003-04-13)|
|Re: lexing backwards Ron@Profit-Master.com (Ron Pinkas) (2003-04-13)|
|Re: lexing backwards firstname.lastname@example.org (Stefan Monnier) (2003-04-15)|
|Re: lexing backwards cfc@TheWorld.com (Chris F Clark) (2003-04-15)|
|Re: lexing backwards email@example.com (2003-05-06)|
|Re: lexing backwards Ron@Profit-Master.com (Ron Pinkas) (2003-05-14)|
|[4 later articles]|
|From:||Marat Boshernitsan <firstname.lastname@example.org>|
|Date:||7 Apr 2003 00:26:02 -0400|
|Posted-Date:||07 Apr 2003 00:26:02 EDT|
>Could anyone point me to work and experience on lexing text locally
>and backwards ?
>Traditionally lexing is done left-to-right with longest-match regexps.
>But such a lexing system obviously means that if you're in the middle
>of a file and want to know what is the previous lexing token, you end
>up (in the general case) having to lex from the beginning of the file.
Tim A. Wagner had done some work on optimal incremental lexing, which
permits you to start lexing from anywhere in the buffer and re-lexes
as few tokens as possible. His work is described in Chapter 5 of his
dissertation "Practical Algorithms for Incremental Software
Development Environments"  and also in an unpublished paper
"General Incremental Lexical Analysis" .
Wagner's algorithms were designed for use in an interactive program
development environments (Ensemble/Harmonia ) and were also used in
the CodeProcessor editor  for per-keystroke lexical analysis.
(This should serve as a testament for their applicability for program
Space overhead of Wagner's algorithm is 3 integer fields per lexical
token (flex condition state, lookahead, and lookback). This overhead
could be reduced (or even eliminated) in most programming languages by
noticing that for majority of tokens these fields will have default
values. I have some ideas on how this can be implemented; feel free
to write me for more informations.
Computer Science Division, EECS
University of California, Berkeley
Return to the
Search the comp.compilers archives again.