Re: what scanner scheme is efficient?

vern@daffy.ee.lbl.gov (Vern Paxson)
21 Nov 1996 23:20:15 -0500

          From comp.compilers

From: vern@daffy.ee.lbl.gov (Vern Paxson)
Newsgroups: comp.compilers
Date: 21 Nov 1996 23:20:15 -0500
Organization: Lawrence Berkeley National Laboratory, Berkeley CA
References: 96-10-076 96-11-103 96-11-123
Keywords: lex, performance

John Lilley (jlilley@empathy.com) wrote:
: What sample have you examined to arrive at the conclusion that *most*
: scanners require backing-up? The languages that I know of (C, C++,
: Pascal) do not require any backtracking to support their keyword sets.


For some reason John's posting never made it to our site. He also sent
it to me via private email, so that was how I answered it. Anyway,
here's my reply.


- Vern


A Johnstone <adrian@dcs.rhbnc.ac.uk> wrote:


> What sample have you examined to arrive at the conclusion that
> *most* scanners require backing-up? ...
> Most of the time, a single character of lookahead with no
> backtracking is sufficient.


Pick a flex scanner you have lying around and run it through flex -b and
you'll see what I mean.


You're right that keywords usually aren't a problem because of generic
identifier rules. I picked them as an example because they clearly show
how backing up works. The sorts of rules that cause problems are
string constants, escape sequences, floating-point representations, and
multi-character operators.
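
To make the floating-point case concrete, here is a minimal sketch of a longest-match scanner that has to back up. It is not flex's generated code; the toy grammar (INT = digits, FLOAT = digits "." digits with an optional exponent), the state names, and the functions step, next_token, and scan are all assumptions made up for illustration. The scanner remembers the last accepting position and, when it gets stuck, returns to it and reports how many characters it had to give back.

```python
DIGITS = "0123456789"

# Accepting states of the toy DFA and the token kind each one yields.
ACCEPTING = {"int": "INT", "frac": "FLOAT", "expdigits": "FLOAT"}


def step(state, ch):
    """Transition function of a hypothetical DFA for INT and FLOAT tokens.

    Returns the next state name, or None if the DFA is stuck.
    """
    is_digit = ch in DIGITS
    if state == "start":
        return "int" if is_digit else None
    if state == "int":
        if is_digit:
            return "int"
        return "dot" if ch == "." else None
    if state == "dot":          # seen "123." but no fraction digit yet
        return "frac" if is_digit else None
    if state == "frac":
        if is_digit:
            return "frac"
        return "exp" if ch in "eE" else None
    if state == "exp":          # seen "1.5e" but no exponent digit yet
        if is_digit:
            return "expdigits"
        return "sign" if ch in "+-" else None
    if state in ("sign", "expdigits"):
        return "expdigits" if is_digit else None
    return None


def next_token(text, pos):
    """Scan one token starting at pos using the longest-match rule.

    Returns (kind, lexeme, new_pos, backed_up), where backed_up is the
    number of characters read past the end of the token and given back.
    """
    state = "start"
    last_accept = None          # (end position, token kind) of the last match
    i = pos
    while i < len(text):
        nxt = step(state, text[i])
        if nxt is None:
            break
        state = nxt
        i += 1
        if state in ACCEPTING:
            last_accept = (i, ACCEPTING[state])
    if last_accept is None:
        # No rule matched: pass a single character through as OTHER.
        return "OTHER", text[pos], pos + 1, 0
    end, kind = last_accept
    return kind, text[pos:end], end, i - end


def scan(text):
    """Tokenize text; each entry is (kind, lexeme, characters backed up)."""
    pos = 0
    tokens = []
    while pos < len(text):
        kind, lexeme, pos, backed_up = next_token(text, pos)
        tokens.append((kind, lexeme, backed_up))
    return tokens
```

On the input "1.5e+x" the scanner reads "1.5e+" hoping for an exponent, fails on "x", and backs up two characters to emit the FLOAT "1.5". States such as "dot", "exp", and "sign" are non-accepting states reachable after an accepting one, which is the situation that forces a scanner to back up.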


                                Vern
--

