|[9 earlier articles]|
|Re: Hash specifics email@example.com (1990-12-13)|
|Re: Hash specifics firstname.lastname@example.org (1990-12-14)|
|Re: Hash specifics email@example.com (1990-12-14)|
|Re: Hash specifics plx!plxsun!evan@Sun.COM (1990-12-15)|
|Re: Hash specifics firstname.lastname@example.org (1990-12-16)|
|Re: Hash specifics schmidt@oberkampf.ICS.UCI.EDU (Doug Schmidt) (1990-12-17)|
|Re: Hash specifics email@example.com (1990-12-17)|
|Re: Hash specifics firstname.lastname@example.org (1990-12-18)|
|Re: Hash specifics email@example.com (Ron Pfeifle) (1990-12-21)|
|From:||firstname.lastname@example.org (Preston Briggs)|
|Organization:||Rice University, Houston|
|References:||<1990Dec12.email@example.com> <2980@lupine.NCD.COM> <EVAN.90Dec15155336@plxsun.uucp>|
|Date:||Mon, 17 Dec 90 15:58:58 GMT|
plx!plxsun!evan@Sun.COM (Evan Bigall) writes:
>The time when I always find myself looking for a perfect hash function, is
>when I am parsing long keyword options (ie: -ansi etc...) or some other case
>when I have a small set of possibilities but want to avoid using:
> if (!strcmp("hello", arg))
> else if (!strcmp("there", arg))
> else if (!strcmp("world", arg))
Another poster (Ira Baxter?) mentioned how to handle these efficiently
in a typical compiler-like case. I'll expand a little.
Since we're building a symbol table anyway, we're going to need to hash every
identifier anyway. I use one hash table to hash keywords, identifiers, and
strings as part of my scanner. Each time a new id or string is encountered,
I allocate a new integer for it. If the integer is < some small number, an
id is actually a keyword.
The rest of the front-end simply looks at integers which serve as indices
into a string table. Makes it easy to switch on "strings" when all are
reduced to integers.
This works especially well with Knuth's WEB system. WEB has a notion of
When Tangling, a string table is maintained and all the strings are replaced
in the output text with integers. Further, the string table is written to a
pool file, suitable for initializing your compiler's hash table (or TeX or
Pre-processed strings are also a nice way to handle error messages and so
forth, potentially saving much storage.
LISPers will wonder what the fuss is about, as they use symbols as a matter
Return to the
Search the comp.compilers archives again.