|Parsing a simple BASIC language firstname.lastname@example.org (paul.dunn4) (2001-04-04)|
|Re: Parsing a simple BASIC language email@example.com (Barry Kelly) (2001-04-10)|
|Re: Parsing a simple BASIC language firstname.lastname@example.org (2001-04-10)|
|Re: Parsing a simple BASIC language email@example.com (2001-04-12)|
|Re: Parsing a simple BASIC language firstname.lastname@example.org (Dunny) (2001-04-12)|
|Re: Parsing a simple BASIC language email@example.com (Barry Kelly) (2001-04-14)|
|Re: Parsing a simple BASIC language firstname.lastname@example.org (2001-04-18)|
|Re: Parsing a simple BASIC language email@example.com (2001-04-18)|
|Re: Parsing a simple BASIC language firstname.lastname@example.org (Dunny) (2001-04-22)|
|Re: Parsing a simple BASIC language email@example.com (Barry Kelly) (2001-04-22)|
|From:||"Barry Kelly" <firstname.lastname@example.org>|
|Date:||14 Apr 2001 17:20:41 -0400|
|Organization:||Ireland On-Line Customer|
|References:||01-04-014 01-04-039 01-04-074|
|Posted-Date:||14 Apr 2001 17:20:41 EDT|
> I'm also writing a Scanner and Parser in Delphi [...] I use a
> recursive descent one to parse a small toy language [...]
> I chose not to use an LL(1) grammar because
> I have to look ahead two tokens in some productions.
I'd use context if I were you - if you look ahead two tokens, you'll
need some kind of 'push back', and that can make coding more
awkward. But it's an implementation detail...
> I use a small fixed-size token stack to buffer the token stream so
> it's as fast as it can get. Of course I could use an LL(1) grammar,
> but why make the grammar harder to read when this way works and is
> fast enough?
All you usually need is 'magic' identifier tokens.
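The fixed-size token stack described above can be sketched as a small buffer sitting between the lexer and the parser: peeking fills the buffer on demand, so the parser gets two-token lookahead without any explicit push-back step. This is a sketch in Python rather than Delphi, and the token values are made up for illustration:

```python
# A minimal token buffer: peek(k) reads ahead into a deque; next()
# drains the deque before touching the underlying lexer again.
from collections import deque

class TokenBuffer:
    def __init__(self, raw_tokens):
        self._raw = iter(raw_tokens)  # underlying lexer output
        self._buf = deque()           # tokens read ahead but not yet consumed

    def peek(self, k=1):
        """Return the k-th upcoming token (1-based) without consuming it."""
        while len(self._buf) < k:
            self._buf.append(next(self._raw, None))  # None marks end of input
        return self._buf[k - 1]

    def next(self):
        """Consume and return the next token."""
        if self._buf:
            return self._buf.popleft()
        return next(self._raw, None)

toks = TokenBuffer(iter(["LET", "x", "=", "1"]))
print(toks.peek(2))  # 'x' - looked two tokens ahead, nothing pushed back
print(toks.next())   # 'LET'
```

Because lookahead only ever moves the buffer forward, there is no push-back to undo, which sidesteps the awkwardness mentioned above.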
> Yes, it is faster than my approach,
Unfortunately, speed is a big concern of mine; I'm writing a
dynamically compiling expression evaluator that essentially produces
a function pointer from a string containing an expression, designed to
be applicable to graphics (matrix maths) and spreadsheet
applications. While repeated evaluations will amortize the upfront
cost of compiling the expression, to compete with simpler interpreted
systems (like parser10, which you'll find on most Delphi archives) I'm
trying to get it as fast as possible.
Delphi, being an amazingly fast compiler, inspires one to try hard, too.
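The idea of turning an expression string into a callable whose compilation cost is amortized over many evaluations can be illustrated with Python's built-in `compile`/`eval` (a rough analogue only - the poster's evaluator emits native code via a function pointer, which this does not attempt; the function and parameter names here are invented):

```python
# Compile an expression string once, then reuse the resulting code
# object on every call - the parse/compile cost is paid up front.
def compile_expr(src, names):
    code = compile(src, "<expr>", "eval")  # one-time compilation
    return lambda *args: eval(code, dict(zip(names, args)))

f = compile_expr("a * x + b", ["a", "x", "b"])
print(f(2, 10, 3))  # 23 - each call reuses the compiled code object
```

Each call binds fresh argument values but skips re-parsing the string, which is where the amortization comes from.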
> but mine is definitely cleaner.
> I use something like the record below for my tokens, which greatly
> aids in producing decent error reporting,
My lexer has this information up front, and I only look ahead one
token, so I don't need it. The main reason I used my 'TToken =
#0..#255' hack is because Object Pascal didn't support overloaded
functions when I first wrote it. IOW, historical reasons.
> >One final note for speed; make sure you're using a hash table for lookup.
> >I've got a fairly fast hashtable on
> I'm going to give it a try, since my own hash table implementation
> pretty much sucks, as it was written in about half an hour.
> Btw, is there
> something like gperf for pascal?
You could translate the output of gperf; AFAICS it is a driver for a
table, so you could write a quick text extractor in Perl to remap it
for a Pascal unit.
I would like to point out, though, that I can't see the point in using
it. The sparser your keyword hash table is, the faster it will be at
rejecting non-matches (since every successful look-up in a hash table
has to be followed up by a byte-for-byte comparison, while a miss on
an empty slot needs no comparison at all), and a larger table is also
almost certainly collision-free. I think it would be faster to merge
your keyword data into your global symbol table, on the basis that a
sparse table will return non-matches much faster.
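The argument can be demonstrated with a small open-addressed table (a sketch in Python rather than Pascal; the keyword list and table size are illustrative): with far more slots than keywords, most non-keyword probes land on an empty slot and are rejected before any string comparison happens.

```python
# Sparse open-addressed keyword table with linear probing.
KEYWORDS = ["let", "if", "then", "goto", "print", "for", "next"]
SIZE = 64  # sparse: many more slots than keywords

table = [None] * SIZE

def _insert(name):
    h = hash(name) % SIZE
    while table[h] is not None:
        h = (h + 1) % SIZE  # probe past a colliding entry
    table[h] = name

for kw in KEYWORDS:
    _insert(kw)

def is_keyword(name):
    h = hash(name) % SIZE
    while table[h] is not None:
        if table[h] == name:  # the only byte-for-byte comparison
            return True
        h = (h + 1) % SIZE
    return False              # empty slot: rejected with no comparison
```

With 7 keywords in 64 slots, a random identifier usually hits an empty slot on the first probe, which is the fast-rejection property being argued for.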