Re: LEX behavior when given "large" automata.

beres@cadnetix.COM (Tim Beres)
Thu, 24 Mar 88 14:25:18 MST

          From comp.compilers

Related articles
Re: LEX behavior when given "large" automata. haddock!uunet!uiucdcs!pur-ee!hankd (1988-03-22)
Re: LEX behavior when given "large" automata. harvard!rutgers!mandrill.cwru.edu!chet@BBN.COM (Chet Ramey) (1988-03-23)
Re: LEX behavior when given "large" automata. beres@cadnetix.COM (1988-03-24)
| List of all articles for this month |

Date: Thu, 24 Mar 88 14:25:18 MST
From: beres@cadnetix.COM (Tim Beres)
Newsgroups: comp.compilers,comp.lang.c,comp.unix.questions
Summary:
Expires:
References: <911@ima.ISC.COM> <914@ima.ISC.COM> <917@ima.ISC.COM> <919@ima.ISC.COM>
Organization: Cadnetix Corp., Boulder, CO

In article <919@ima.ISC.COM> haddock!uunet!uiucdcs!pur-ee!hankd (Hank Dietz) writes:
>In article <917@ima.ISC.COM>, sargas.usc.edu!tli@oberon.usc.edu (Tony Li) writes:
>> In fact, another cute trick is to toss in a simple hashing function.
>> Unless you've got lots of keywords, you usually can get away with
>> doing only one strcmp.
>
>I'm very pleased to see many people confirming that what I've
>done and told my students to do is reasonably widely accepted
>(despite not appearing in any compiler textbook I know of)...


We use the symbol lookup "trick" for our lex as well. But instead of
hashing we use a set of balanced B-Trees to store our symbols.
We keep a tree for each length string, up to some max length.
Works reasonably well if your symbols are somewhat spread over
the length spectrum...anyway symbol lookup is not our bottleneck.


One more note - I've found that I can exceed the lex defaults when using
a rule per symbol (lots of rules and symbols); a nice touch of using
a symbol lookup method is reduction in lex usage [It is possible to
increase lex table sizes, the %x nnn control, but I prefer to limit
my lex because it becomes simpler and easier to debug with fewer rules.
Also, the symbol lookup method is a nice model when used in conjuntion
with the parsers we need.]
--
Tim Beres Cadnetix 303/444-8075 x221
                      5775 Flatirons Pkwy {uunet,boulder,nbires}!cadnetix!beres
                      Boulder, CO 80301 beres@cadnetix.com
<disclaimers: my thoughts != Cadnetix's>
[It's hard to believe that storing a symbol table in a B-tree is worth the
effort; B-trees make sense for disk files since looking at a page is
relatively expensive, but for an in-core symbol table the O(1) lookup time
of hashing should be better. It's also a lot easier to program. -John]
--


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.