|[6 earlier articles]|
|Re: perfect hashing email@example.com.OZ.AU (2000-08-10)|
|Re: perfect hashing firstname.lastname@example.org (Parzival) (2000-08-10)|
|Re: perfect hashing email@example.com (2000-08-13)|
|Re: perfect hashing firstname.lastname@example.org (Lars Duening) (2000-08-13)|
|Re: perfect hashing email@example.com (2000-08-13)|
|Re: perfect hashing firstname.lastname@example.org (Tim Josling) (2000-08-13)|
|Re: perfect hashing email@example.com (2000-08-13)|
|Re: perfect hashing intmktg@Gloria.CAM.ORG (Marc Tardif) (2000-08-13)|
|Date:||13 Aug 2000 19:07:25 -0400|
|Organization:||Deja.com - Before you buy.|
|References:||00-07-064 00-08-022 00-08-026 00-08-062|
> > Indeed, I wouldn't use it for keywords.
> > Consider the typical scanner.
> > a) Recognize that we have some sort of keyword or identifier
> > b) See if it's a keyword
> > c) If not, look it up in the symbol table
> Step a) is done by the lexical analyser, and if the token class
> 'identifier' excludes reserved keywords, or 'identifier's which are
> also keywords are lexed as keywords, then the job of recognizing
> keywords distinct from identifiers is complete in step a. This
> eliminates step b, and avoids a perfect hasher too.
Assuming (c) requires a hash value, if you calculate the hash for all
identifiers in (a) you eliminate the loop overhead of calculating the
hash for (c). That's something like three instructions per character.
My current favorite ASCII-character-at-a-time hash is
hash = (hash ^ current_character) + ((hash<<26)+(hash>>6));
- Bob Jenkins
Return to the
Search the comp.compilers archives again.