Re: Symbol tables and scopes

"David R Tribble" <david@tribble.com>
2 Feb 2006 11:47:37 -0500

          From comp.compilers

Related articles
Symbol tables and scopes DrDiettrich@compuserve.de (Hans-Peter Diettrich) (2006-01-28)
Re: Symbol tables and scopes david@tribble.com (David R Tribble) (2006-02-02)
Re: Symbol tables and scopes cfc@shell01.TheWorld.com (Chris F Clark) (2006-02-03)
Re: Symbol tables and scopes Peter_Flass@Yahoo.com (Peter Flass) (2006-02-03)
Re: Symbol tables and scopes gdr@integrable-solutions.net (Gabriel Dos Reis) (2006-02-06)
Re: Symbol tables and scopes DrDiettrich@compuserve.de (Hans-Peter Diettrich) (2006-02-06)
Re: Symbol tables and scopes DrDiettrich@compuserve.de (Hans-Peter Diettrich) (2006-02-06)
Re: Symbol tables and scopes gah@ugcs.caltech.edu (glen herrmannsfeldt) (2006-02-06)
[17 later articles]
| List of all articles for this month |

From: "David R Tribble" <david@tribble.com>
Newsgroups: comp.compilers
Date: 2 Feb 2006 11:47:37 -0500
Organization: http://groups.google.com
References: 06-01-101
Keywords: symbols
Posted-Date: 02 Feb 2006 11:47:36 EST

Hans-Peter Diettrich (DoDi) wrote:
> Currently I want to implement an efficient symbol table, for
> navigation in some source code. This means that searching the symbol
> table(s) should be sufficiently fast, for use in an interactive
> application.
>
> In my latest parser I used a single global symbol table, implemented
> as a stack with clever hashing. Local symbols are pushed onto the
> stack, and are removed again when a scope is closed (LiFo). The hash
> function only has to return the last added matching symbol first, so
> that the time for the lookup of any symbol is affected only by the
> number of symbols with the same hash code.
>
> In an IDE instead a tree of scopes must be used, and a symbol must be
> searched in all scopes, from the current (innermost local) scope up to
> the global scope in the root of the tree. Any suggestions on the
> implementation of such a data structure, or should I stay with my
> existing approach, extended by a search through multiple symbol
> tables?


Off the top of my head, two approaches come to mind. The first is to
maintain a stack of scopes, and use a hash table for each symbol table
within each scope. Obviously, as the scope gets more nested, symbol
searches will take longer.


The second approach is to maintain one hash table for all symbol
names, and as a new scope is entered, its symbols are added to the
hash table; any coliisions with previously existing symbols (i.e.,
symbol names occuring in outer scopes) can be chained together, adding
the inner scoped name to the front of the chain. Thus searching for a
given symbol name is just as fast regardless of scoping depth
(depending only on the total symbol hash table size), and finding a
matching name retrieves the symbol within the innermost scope in which
it is visible (because it is the first symbol in each chain).


The difficulty with this approach is keeping track of the scope of
each name, so that when the scope is exited, all of its symbols can be
efficiently removed from the hash table. If you don't mind the extra
space, a new hash table can be built upon entering a new scope, adding
all the new symbols of the new scope on top of the symbols already in
the table from outer scopes, and pushing the previous (outer scoped)
hash table onto a stack. Leaving a scope then simply pops the
innermost hash table off the stack, restoring the hash table of the
previous scope.




> And how to implement namespaces?
>
> IMO namespaces have been "invented" to reduce the amount of
> concurrently used symbols, what's fine in a sequential parser for
> single source modules, but what doesn't help much in a development
> environment, where all used namespaces must be accessible at the same
> time. A separation of every symbol table into namespaces will increase
> the number of tables to search, with little benefits. The ability to
> load an namespace on demand will add the time for the disk access to
> the search time, what can make such an approach unusable in
> interactive applications :-(


Namespaces can also form a nested/stacked heirarchy (such as in C++),
so you have to deal with both multiple namespaces and nested spaces
within each of those spaces.


Again, some kind of single hash table having similar names chained
together, ordered by scope, may be the faster approach.


-drt
http://david.tribble.com


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.