Re: C symbol table structures

"Gene Ressler" <resslere@erols.com>
17 Jan 1999 20:52:28 -0500

          From comp.compilers

Related articles
C symbol table structures mac@coos.dartmouth.edu (1999-01-02)
Re: C symbol table structures kadhim@lucent.com (Basim Kadhim) (1999-01-04)
Re: C symbol table structures heinrich@idirect.com (1999-01-06)
Re: C symbol table structures drh@microsoft.com (Dave Hanson) (1999-01-11)
Re: C symbol table structures resslere@erols.com (Gene Ressler) (1999-01-17)
| List of all articles for this month |

From: "Gene Ressler" <resslere@erols.com>
Newsgroups: comp.compilers
Date: 17 Jan 1999 20:52:28 -0500
Organization: Compilers Central
References: 99-01-005
Keywords: symbols

I have tried this kind of "shallow binding," a "stack of symbol
tables," and a third approach that led to the cleanest and fastest
code, at least in an environment where inherited attributes were
suppored for the parser (sorry yacc'ers).


I suppose you'd say it's "Lispish" or "functional". All the info
normally kept in the symbol table except the id string goes into IR
nodes, in my case AST "declaration" nodes. A symbol table node is
just an id and a pointer to such a node. Each declaration node points
to its parent "block" node (procedure, function, scope block, etc.);
each block to its symbol table. A symbol table is a dictionary of
symbol table nodes, a pointer to the table of the enclosing scope
(null for outermost), and a pointer to its block. These pointers seem
adequate to do all the usual lookups quickly. If you are pressed for
space, you can allocate from arrays and use 2 byte node numbers rather
than 4 byte pointers.


I know all this is pretty standard. The thing I've not seen before is to
use the symbol table as an explicit inherited attribute param in the parser
rather than fooling around with a stack. So in a top-down parser you get
procedures like


parse_function_declaration(ST st1) {
    ST st2 = parse_type(st1); // Parse function return type (may add new struct!)
    match(IDENTIFIER, st2); // Get the function name into the symtab
    ST st3 = new ST(st2); // Make a new table and chain it onto the old one.
    parse_params(st3); // New block so get params into the new table!
    parse_body(st3); // Still in new block; use new table
    parse_rest_of_decls(st2); // Parse the rest with the old table
}


I.e. the code for managing symbols tends to read like the language
spec. You avoid an explicit stack, so you don't have to remember to
pop it. Notwithstanding intuition, a good compiler optimizes away
much of the st param passing.


The kicker as with all "one st per scope" solutions is coming up with
a st structure that is very small when storing small numbers of IDs
(like struct scopes).


Gene


Alex Colvin wrote in message 99-01-005...
>I've been working on a compiler for C* (a dialect of C).
>
>I'm using a symbol table structure that I think I heard about here
>(comp.compilers) long ago. Can anyone tell me if this has a name, or if
>i got it right?
>
>Each scope (block) is numbered, in increasing order. Each identifier
>has a list of definitions and scope numbers.


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.