Re: A handful of LISP questions

Nils M Holm <>
Tue, 19 Jun 2007 08:15:48 +0200

          From comp.compilers

Related articles
A handful of LISP questions (tactics) (2007-06-19)
Re: A handful of LISP questions (Gene) (2007-06-19)
Re: A handful of LISP questions (Nils M Holm) (2007-06-19)
Re: A handful of LISP questions (Martin Rodgers) (2007-06-19)
Re: A handful of LISP questions (Lieven Marchand) (2007-06-19)
Re: A handful of LISP questions (Andreas Hinze) (2007-06-19)
Re: A handful of LISP questions (tactics) (2007-06-20)
Re: A handful of LISP questions (Gene) (2007-06-20)
| List of all articles for this month |

From: Nils M Holm <>
Newsgroups: comp.compilers
Date: Tue, 19 Jun 2007 08:15:48 +0200
Organization: Compilers Central
References: 07-06-033
Keywords: Lisp, parse
Posted-Date: 19 Jun 2007 10:27:28 EDT

On 2007-06-19, tactics <> wrote:
> My first question, and the most important to me for now, I think, is
> what is the best way to write a parser for a LISP? It was only out of
> the grace of the C Gods that I got my current parser working. I have a
> nice method for breaking a raw string into tokens (which I was cute
> about, and instead of returning an array of tokens, it returns a
> cons'ed list of C structure LISP-objects). But once I have the tokens,
> I do some pretty bad black magic to get the final list structure. [...]

Most LISPs do not have a full-blown parser, but a so-called reader,
which is typically implemented in the READ procedure. The same
procedure that reads S-expressions is also used to read complete LISP

Converting text to tokens seems to be a redundant step to me. Once
you have read the text of a token, you know what it is, and you can
generate the internal representation in an instant. LISP variables
have no types, so when you read a symbol, you generate a symbol.
When you read a number, you generate a number. To read the members
of a list you call READ recursively.

> I was wondering from a theoretical standpoint, what kind of grammar is
> a LISP grammar?

sexpr := atom
              | list

list := '(' members ')'
            | '(' ')'

members := sexpr
                  | sexpr members

atom := symbol
            | number
            | string
            ... add more atoms to your taste ...

> Obviously, it's a straightforward CFG, but is it LL(1)?

When you refactor 'members' and 'list', the grammar is LL(1). In fact,
LISP is so easy to read (by a reader procedure) that I have never
bothered to write a lexer or a parser for it. For example, a READ
procedure for Scheme may be implemented in a few hundred lines of
C. See [1] for some actual code.

> My next question is about garbage collection. I have a nice mark and
> sweep algorithm written, but the big issue here is I don't know when
> to actually DO garbage collection. Right now, I just call it at the
> end of my program, but that is hardly useful. Do I just run it
> periodically in a separate thread? Do I set a recurring signal timer?
> In either case, what is the best way to ensure I still have access to
> the current environment?

Mark and sweep is a good choice for a simple LISP. Just make sure
that you use a constant-space version, so that it can deal with long
or deep structures gracefully.

A naive approach would be to run the GC when you run out of memory.
A much better idea would be to run it when your memory pool runs
low (but not dry). This is because when you wait until you are out
of memory, collections occur at a quickly increasing rate, which
degrades performance dramatically.

Concurrent GC is only useful if you do not want the collector to
interrupt program execution. See [2] for an extensive coverage of GC

[2] Jones, Lins
        "Garbage Collection"
        Wiley & Sons, 1996

Nils M Holm <> --
[Discussions of garbage collection would be more appropriate on gclist.
See -John]

Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.