Re: Functional Symbol Tables

kennheinrich@sympatico.ca
Wed, 18 Jul 2007 19:07:32 -0700

          From comp.compilers

Related articles
Functional Symbol Tables sgillespie@bisonofborg.com (bison) (2007-07-09)
Re: Functional Symbol Tables torbenm@app-4.diku.dk (2007-07-13)
Re: Functional Symbol Tables a.t.hofkamp@tue.nl (A.T.Hofkamp) (2007-07-13)
Re: Functional Symbol Tables kennheinrich@sympatico.ca (2007-07-18)
| List of all articles for this month |

From: kennheinrich@sympatico.ca
Newsgroups: comp.compilers
Date: Wed, 18 Jul 2007 19:07:32 -0700
Organization: Compilers Central
References: 07-07-03907-07-048 07-07-049
Keywords: functional, symbols
Posted-Date: 19 Jul 2007 00:43:04 EDT

On Jul 13, 5:05 am, "A.T.Hofkamp" <a.t.hofk...@tue.nl> wrote:
> On 2007-07-09, bison <sgilles...@bisonofborg.com> wrote:
> > I'm trying to figure out a way to implement symbol tables in a
> > functional style, but I can't figure out a way to do that. The only
> > way I can think to do it is to keep track of names during parsing for
> > each scope and upon entering a scope add these bindings to the
> > existing table.
>
> Sorry, but I am not sure that I understand your problem.
>
> Let me explain what I did with symbol tables in the functional-style
> transformation formalism ASF+SDF:
>
> My nested symbol tables are defined like (edited to make it look like BNF).
>
> Symbol-layer ::= "single" "{" Symbol* "}" Symbol-layer
> | "null" "{" "}"
>
> Symbol-layer is a nested list of scopes. At the bottom in a "null" scope
> without symbols (between the curly brackets). The null-scope is prefixed by a
> stack of "single" scopes, each with a list of Symbols (Symbol*) in it.
> (in reality, I have two other forms of scopes, hence the keyword "single").
>
> On this syntax, I defined a number of functions for creating and deleting scopes:
>
> "new-null-layer" -> Symbol-layer
> "new-single-layer" ( Symbol-layer ) -> Symbol-layer
> "del-layer" ( Symbol-layer ) -> Symbol-layer
>
> and some functions for adding and getting symbols to these scopes:
>
> "may-add-symbol" ( Symbol-layer, Symbol ) -> Boolean
> "add-symbol" ( Symbol-layer, Symbol ) -> Symbol-layer
> "get-symbol" ( Symbol-layer, Identifier ) -> Symbol
>
> (slightly simplified w.r.t. reality). Note that Symbol contains the name of
> the symbol, so it can be added to the table without specifying an Identifier.
>
> > I read somewhere that you can pass around your ST through arguments.
>
> With functional programming, you don't have global variables, everything is a
> parameter of the function that is executed.
>
> > My thought about that is that if you are using recursion, you will
> > lose all of your mappings once that call is returns. Does anyone have
> > any thoughts on this?
>
> Euh, I guess so. Normally, you return any interesting results (such as the
> modified symbol table) as return value from the function call, and recurse
> into the next phase, passing the table as argument, like
>
> "compile" ( text ) -> code
>
> could be implemented as (extremely simplified)
>
> [] compile(text) = code
> when ast := parse(text),
> sym := new-single-layer(new-null-layer),
> sym2, ast2 := typecheck(sym, ast),
> code := codegen(sym2, ast2)
>
> ie, to execute "compile(text)", first do "ast := parse(text)", then "sym :=
> new-single-layer(new-null-layer)", etc, until you have done "code :=
> codegen(sym2, ast2)". Finally, return the code by "= code" at the first line.
>
> Each successive call takes arguments returned as results by the previous call.
>
> Does this answer your question?
>
> Albert


The "functional style" requirement opens up two possible sub-
requirements. I'm not sure which is more interesting for the OP. One
sub-requirement is for functional-style immutability, so that the
symbol table loses it's imperative flavour and is no longer treated as
"one and only one" symbol table updated by side effects, but becomes a
series of small updates threaded between calls to parsing procedures.
The other sub-requirement is to have, in general, the ability to
thread a series of symtabs through these parsing procedures. The
former is a data structure question. The latter is a parsing
architecture question. On the data structure side, I'm not sure how
best to reconcile the O(n) nature of a series of small updates with
the O(1) efficiency of a hash table. On the parsing architecture, I've
had success using combinators.


I've used an approach in ML using parser combinators, where the symtab
is threaded through the parsers right along with the stream of lex
tokens, so the parsers operate on a stream of (symtab * token list)
rather than a simple token list. The symtab itself is stacked,
similarly to the one described in a previous post, and there exist
combinators to push and pop scopes, and also to hide and reveal
symtabs (push and pop on another axis) to open e.g. field names in a
record.


Some nice examples of combinators (from memory, so the correct idea is
there if not the correct type signature),


pOpenScope ; consumes no tokens but opens a new layer


pCloseScope ; same but pops a layer


pAutoScope p : (pOpenScope *> p) *> pCloseScope
; where p is a parser returning 'v.
; this wraps p in a new layer, returning (symtab-of-new-decls, 'v),
then continues in original symtab scope


For example, parsing a record declaration, and creating a declaration
(of signature (record id, field sub-sym-table)) becomes ML combinator
code similar to this:


fun pParseRecordDecl (s:stream) =
pBindDecl (pPair (pHeaderAndId, (pAutoScope pFields))) s


In this example combinator (which is really close to the actual ML
code, modulo some private syntactic sugar), the symtabs are threaded
through all the right places almost invisibly.


An excellent combinator reference is a paper "Fast, Error Correcting
Parser Combinators: A Short Tutorial" by Swierstra and Alcocer.


HTH,


  - Kenn


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.