Re: Speeding up LEX scanning times

mercier@hollywood.cinenet.net (Bob Mercier)
Fri, 3 Feb 1995 18:55:02 GMT

          From comp.compilers


Newsgroups: comp.compilers
From: mercier@hollywood.cinenet.net (Bob Mercier)
Keywords: lex, Cobol, performance
Organization: Cinenet Communications,Internet Access,Los Angeles;310-301-4500
References: 95-02-010
Date: Fri, 3 Feb 1995 18:55:02 GMT

Pieter Hintjens (pahint@eunet.be) wrote:
: I'm writing a Cobol parser, using MKS Lex and Yacc. So far so good.
: However, on seriously large programs, it is quite slow. When I profiled
: the code, I noticed that about 80% of the time was in the Lex scanner.
: Now, I found that the standard C functions for file access (fread) are
: a lot slower than the non-standard read functions, so I shaved off
: some time by using these if the compiler supports them.


: However, I still find that the scanner is slow. I don't think I made
: any mistakes; for instance all keywords are identified by looking-up
: a table, rather than as individual scanner tokens.


Are you saying that you use lex to collect an ID and then test
it against some keyword table? If so, this is going to be slower
than just giving lex the list of keywords:


"if" return tIF;
"else" return tELSE;
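
For comparison, the table-lookup approach might look roughly like the
sketch below. This is only a guess at what the original poster does, not
his actual code: the function name classify_word, the tIDENT token, and
the token values are all made up for illustration. The point is that the
lookup is extra work done after lex has already matched the identifier,
whereas keyword rules are folded into the same DFA pass.

```c
#include <stdlib.h>
#include <string.h>

/* Token codes: tIF and tELSE mirror the rules above; tIDENT and the
 * numeric values are assumptions made for this sketch. */
enum { tIDENT = 256, tELSE, tIF };

struct keyword {
    const char *name;
    int token;
};

/* Must stay sorted by name, because bsearch() requires sorted input. */
static const struct keyword keywords[] = {
    { "else", tELSE },
    { "if",   tIF   },
};

/* bsearch() comparator: the key is the scanned text, the element a table row. */
static int compare_keyword(const void *key, const void *elem)
{
    return strcmp((const char *)key, ((const struct keyword *)elem)->name);
}

/* Called after the scanner has matched a generic identifier: decide
 * whether the matched text is a keyword or a plain identifier. */
int classify_word(const char *text)
{
    const struct keyword *kw = bsearch(text, keywords,
                                       sizeof keywords / sizeof keywords[0],
                                       sizeof keywords[0], compare_keyword);
    return kw ? kw->token : tIDENT;
}
```

In a lex specification this would be called from the action of a single
identifier rule, e.g. return classify_word(yytext);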


: So my question is: should I consider writing the scanner by hand,
: now that I have a working prototype? If so, are there any techniques
: I should be aware of?


There is a great tool called 're2c'; you should be able to find it
via archie. It's slightly harder to use than lex but about twice
as fast as even flex in its best speed mode. Like lex, it builds
a DFA out of the strings it tries to match. lex spits out tables
describing the DFA and emits code to walk the tables as it scans
its input; re2c instead emits C code that directly implements the
finite state machine described by the DFA. It also allows a little more
flexibility in hooking into your I/O system, especially if you can mmap
files.
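
To make the table-driven versus direct-coded distinction concrete, here
is a small hand-written sketch of both styles for a toy language with
just identifiers and numbers. It is not output from lex or re2c, and
every name in it (scan_table, scan_direct, the T_* tokens) is invented
for illustration. Both functions take a pointer to a NUL-terminated
input cursor, advance it past one token, and return the token kind.

```c
#include <ctype.h>

enum token { T_EOF, T_IDENT, T_NUMBER, T_ERROR };

/* Character classes: column indices for the transition table. */
enum cclass { C_LETTER, C_DIGIT, C_OTHER, C_COUNT };

static enum cclass classify(unsigned char c)
{
    if (isalpha(c)) return C_LETTER;
    if (isdigit(c)) return C_DIGIT;
    return C_OTHER;
}

/* States: 0 = start, 1 = in identifier, 2 = in number; -1 = stop. */
static const int next_state[3][C_COUNT] = {
    /* start  */ {  1, 2, -1 },
    /* ident  */ {  1, 1, -1 },
    /* number */ { -1, 2, -1 },
};

/* Table-driven style: one generic loop walks the transition table. */
enum token scan_table(const char **pp)
{
    const char *p = *pp;
    int state = 0, next;

    while (*p == ' ' || *p == '\t' || *p == '\n') p++;
    if (*p == '\0') { *pp = p; return T_EOF; }
    while ((next = next_state[state][classify((unsigned char)*p)]) >= 0) {
        state = next;
        p++;
    }
    if (state == 0) { *pp = p + 1; return T_ERROR; } /* skip the bad char */
    *pp = p;
    return state == 1 ? T_IDENT : T_NUMBER;
}

/* Direct-coded style: each DFA state is a label, each transition a goto,
 * so there is no table fetch or state variable in the inner loop. */
enum token scan_direct(const char **pp)
{
    const char *p = *pp;
    enum token result;
start:
    switch (*p) {
    case ' ': case '\t': case '\n': p++; goto start;
    case '\0': result = T_EOF; goto done;
    default: break;
    }
    if (isalpha((unsigned char)*p)) { p++; goto ident; }
    if (isdigit((unsigned char)*p)) { p++; goto number; }
    p++; result = T_ERROR; goto done;                /* skip the bad char */
ident:
    if (isalnum((unsigned char)*p)) { p++; goto ident; }
    result = T_IDENT; goto done;
number:
    if (isdigit((unsigned char)*p)) { p++; goto number; }
    result = T_NUMBER;
done:
    *pp = p;
    return result;
}
```

The two functions accept the same input and should agree token for
token; the difference is only in how the DFA is represented. With an
mmap'ed file the cursor would be bounded by an end pointer rather than
a terminating NUL, which this sketch does not handle.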


Bob
--

