Subject: Re: perfect hashing
From: "Tzvetan Mikov" <email@example.com>
Date: 5 Aug 2000 21:28:59 -0400
References: 00-07-064 00-08-022 00-08-026
Preston Briggs <firstname.lastname@example.org> wrote:
> The moderator wrote:
> [Perfect hashing is sometimes useful for a fixed table of keywords, but
> you can't use it for a normal symbol table. -John]
> Indeed, I wouldn't use it for keywords.
> Consider the typical scanner.
> a) Recognize that we have some sort of keyword or identifier
> b) See if it's a keyword
> c) If not, look it up in the symbol table
> If we use perfect hashing for step b, then we'll _always_ find
> some entry and have to finish with a string comparison.
We could avoid the string comparison in many cases. For each keyword
we calculate an additional hash value and store it in the perfect hash
table alongside the string itself. Then we need to compare the strings
only if the stored hash value matches the token's. Since the additional
hash function is different from the perfect hashing function, we won't
get many false positives, and the string comparison will usually just
confirm the match.
Calculating the additional hash value is essentially free if the
symbol table for the identifiers uses the same hash function - it has
to be calculated anyway.
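
A minimal sketch of that idea in C, under stated assumptions: the
seven-keyword set, the toy perfect hash (first character plus last
character, modulo 8), the multiply-by-31 secondary hash and all the
names (kw_table, kw_lookup, sym_hash, kw_strcmp_calls) are made up
for illustration and are not taken from any real compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define KW_SLOTS 8

struct kw_entry {
    const char *name;  /* NULL marks an unused slot */
    uint32_t    hash;  /* secondary hash, same function as the symbol table */
};

static struct kw_entry kw_table[KW_SLOTS];
static unsigned long   kw_strcmp_calls;  /* demo counter: string compares done */

/* Toy perfect hash: collision-free only for the seven keywords below. */
static unsigned kw_perfect_hash(const char *s, size_t len)
{
    return ((unsigned char)s[0] + (unsigned char)s[len - 1]) % KW_SLOTS;
}

/* Secondary hash, assumed to be shared with the identifier symbol table. */
static uint32_t sym_hash(const char *s, size_t len)
{
    uint32_t h = 0;
    for (size_t i = 0; i < len; i++)
        h = h * 31u + (unsigned char)s[i];
    return h;
}

/* Place every keyword in its slot and cache its secondary hash. */
static void kw_init(void)
{
    static const char *const words[] = {
        "if", "do", "for", "int", "else", "while", "goto"
    };
    memset(kw_table, 0, sizeof kw_table);
    kw_strcmp_calls = 0;
    for (size_t i = 0; i < sizeof words / sizeof words[0]; i++) {
        size_t len = strlen(words[i]);
        struct kw_entry *e = &kw_table[kw_perfect_hash(words[i], len)];
        assert(e->name == NULL);  /* the hash must be perfect for this set */
        e->name = words[i];
        e->hash = sym_hash(words[i], len);
    }
}

/* Returns the keyword, or NULL if the token is not one.
   h is the token's sym_hash value, which the caller needs anyway. */
static const char *kw_lookup(const char *s, size_t len, uint32_t h)
{
    const struct kw_entry *e;
    if (len == 0)
        return NULL;
    e = &kw_table[kw_perfect_hash(s, len)];
    if (e->name == NULL || e->hash != h)
        return NULL;              /* rejected without touching the string */
    kw_strcmp_calls++;
    if (strlen(e->name) == len && memcmp(e->name, s, len) == 0)
        return e->name;
    return NULL;
}
```

An identifier such as "width" lands in the slot of "if", but its
secondary hash differs, so it is rejected before any string
comparison; only a real keyword (or a rare hash collision) reaches
memcmp.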
> [I agree. I've fooled around with perfect hash generators, but in
> practice it makes more sense to load the keywords into the same symbol table as the
> regular symbols so you can look up every text token in one try. -John]
I guess that would be faster, but it seems "cleaner" not to mix the
reserved words and the symbol table together. Going down this road in
a C compiler we could even use the same symbol table for preprocessor
defines as well (has that been done?). Is that a good idea?
[Seems pretty clear to put all the token info in one table. -John]
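
For comparison, the single-table approach might look roughly like the
following C sketch. It is only an illustration under assumptions: the
chained hash table, the token kinds and the names (intern,
preload_keywords, hash_name, NBUCKETS) are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

enum tok_kind { TOK_IDENT, TOK_IF, TOK_ELSE, TOK_WHILE };

struct symbol {
    struct symbol *next;   /* next entry in the same bucket */
    enum tok_kind  kind;   /* keyword kind, or TOK_IDENT */
    char           name[]; /* NUL-terminated spelling (C99 flexible array) */
};

#define NBUCKETS 64
static struct symbol *buckets[NBUCKETS];

static unsigned hash_name(const char *s, size_t len)
{
    unsigned h = 0;
    for (size_t i = 0; i < len; i++)
        h = h * 31u + (unsigned char)s[i];
    return h;
}

/* Find the entry for a token, creating it as TOK_IDENT if it is new. */
static struct symbol *intern(const char *s, size_t len)
{
    unsigned b;
    struct symbol *p;
    assert(s != NULL);
    b = hash_name(s, len) % NBUCKETS;
    for (p = buckets[b]; p != NULL; p = p->next)
        if (strlen(p->name) == len && memcmp(p->name, s, len) == 0)
            return p;
    p = malloc(sizeof *p + len + 1);
    if (p == NULL)
        abort();
    memcpy(p->name, s, len);
    p->name[len] = '\0';
    p->kind = TOK_IDENT;
    p->next = buckets[b];
    buckets[b] = p;
    return p;
}

/* Run once before scanning: keywords become ordinary entries whose
   kind is already set, so one lookup classifies every text token. */
static void preload_keywords(void)
{
    intern("if", 2)->kind = TOK_IF;
    intern("else", 4)->kind = TOK_ELSE;
    intern("while", 5)->kind = TOK_WHILE;
}
```

The scanner then calls intern() for every text token and switches on
the kind field of the result; there is no separate keyword probe.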