Re: Looking for symbol table hashing algorithm

Paul Dietz <dietz@interaccess.com>
5 Jul 1998 21:33:01 -0400

          From comp.compilers

Related articles
Looking for symbol table hashing algorithm wal@cosc.canterbury.ac.nz (Warwick Irwin) (1998-07-03)
Re: Looking for symbol table hashing algorithm dwight@pentasoft.com (1998-07-03)
Re: Looking for symbol table hashing algorithm drunelson@my-dejanews.com (1998-07-03)
Re: Looking for symbol table hashing algorithm jmccarty@sun1307.spd.dsccc.com (1998-07-03)
Re: Looking for symbol table hashing algorithm zs@cs.mu.OZ.AU (1998-07-05)
Re: Looking for symbol table hashing algorithm chase@naturalbridge.com (David Chase) (1998-07-05)
Re: Looking for symbol table hashing algorithm dietz@interaccess.com (Paul Dietz) (1998-07-05)
Re: Looking for symbol table hashing algorithm khays@sequent.com (1998-07-05)
Re: Looking for symbol table hashing algorithm eodell@pobox.com (1998-07-08)
Re: Looking for symbol table hashing algorithm preston@tera.com (1998-07-08)
Re: Looking for symbol table hashing algorithm drh@microsoft.com (Dave Hanson) (1998-07-08)
Re: Looking for symbol table hashing algorithm smith@aragorn.uni-trier.de (1998-07-10)
Re: Looking for symbol table hashing algorithm scott@basis.com (1998-07-10)
[6 later articles]
| List of all articles for this month |

From: Paul Dietz <dietz@interaccess.com>
Newsgroups: comp.compilers
Date: 5 Jul 1998 21:33:01 -0400
Organization: The World's Usenet -- http://www.Supernews.com
References: 98-07-030 98-07-045
Keywords: symbols

> Warwick Irwin wrote:
> >
> > Can anyone point me to a reference for an industrial-strength hashing
> > algorithm suitable for implementing a symbol table in a C++ parser?
> > I would like something that has been tuned for large C++ programs.


David Chase wrote:
> Do you want a hash function, or a hashing algorithm?
>
> There was an article, I think in CACM, that spoke well of a
> table-based algorithm. Simply fill an array of 256 integers with
> random numbers, and use an algorithm along the lines of:


[deleted]


He might want to focus more on algorithms that handle several bytes at
a time. Arrange the identifiers to be long word aligned (and,
preferably, aligned with the start of a cache line) and padded with
null bytes to an integral number of long words.


A hash function like


      h(x_1 ... x_n) = (a_1 x_1 + ... + a_n x _n) mod m


(where the a_i are randomly chosen constants and x_i are the long
words of the identifier) might be efficiently computed if
multiplication were fast or were pipelined.


Paul
[Is the hash really likely to be a hot spot in the compiler? Also, it is
my distinct impression that most identifiers are pretty short in languages
like C, C++, and Fortran. -John]


--


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.