|Masters course with compiler specialization email@example.com (Jeremy Wright) (2002-11-12)|
|Re: Masters course with compiler specialization Trevor.Jenkins@suneidesis.com (2002-12-11)|
|Re: Masters course with compiler specialization firstname.lastname@example.org (2002-12-19)|
|Size of hash tables was Re: Masters course with compiler specializat Trevor.Jenkins@suneidesis.com (2002-12-22)|
|Re: Size of hash tables was Re: Masters course ... email@example.com (Joachim Durchholz) (2002-12-30)|
|Re: Size of hash tables was Re: Masters course ... firstname.lastname@example.org (Matthias Neeracher) (2003-01-04)|
|Re: Size of hash tables was Re: Masters course ... email@example.com (2003-01-04)|
|Re: Size of hash tables was Re: Masters course ... firstname.lastname@example.org (Stephan Eggermont) (2003-01-07)|
|From:||email@example.com (Paolo Bonzini)|
|Date:||4 Jan 2003 22:47:23 -0500|
|References:||02-11-060 02-12-056 02-12-092 02-12-107 02-12-127|
|Posted-Date:||04 Jan 2003 22:47:23 EST|
> > Since the publication of Maurer's paper "An improved hash code for
> > scatter storage" in the Comm of the ACM (vol 11, Jan 1968, pp 35--38)
> > it is taken as gospel that hash tables only work if the size is a
> > prime number.
> Not "only work". Just "distribute their keys in a more random fashion,
> assuming you don't have a priori knowledge about key distribution".
You can always design a good-enough hash function so that the
distribution of the hash-function does not require the additional
bit-shuffling that a modulo-prime operation does. Considering strings
as keys, you can say that prime sizes are only required when a
character only affects a few specific and clustered bits, instead of
having some chance of affecting every bit.
> Particularly on modern hardware, where division and bit masking have
> roughly the same execution cost. Could anybody clarify?
Only roughly (and only if you can parallelize things well -- a
division has some cycles of latency, while you can do several
bitmaskings in the same clock cycle). Why do things roughly if it's
simple to do them better? :-)
You can also do double-hashing with power-of-two hash tables, only
make sure that you set the low bit of the secondary hash. e.g. size is
8, primary hash is 3, secondary hash is 4 --> does not work, you only
examine two items; but if secondary hash is 5, you examine eight
buckets (in this order: 3, 0, 5, 2, 7, 4, 1, 6).
Return to the
Search the comp.compilers archives again.