Re: A minimal LL(1) parser generator ? (Anton Ertl)
Thu, 26 Dec 2019 16:21:29 GMT

          From comp.compilers

Related articles
A minimal LL(1) parser generator ? (Andy) (2019-12-21)
Re: A minimal LL(1) parser generator ? (2019-12-22)
Re: A minimal LL(1) parser generator ? (2019-12-26)
Re: A minimal LL(1) parser generator ? (2019-12-29)
Re: A minimal LL(1) parser generator ? (2019-12-31)
Re: A minimal LL(1) parser generator ? (2019-12-31)
Re: A minimal LL(1) parser generator ? (2020-01-01)
Re: A minimal LL(1) parser generator ? (honey crisis) (2020-01-02)
Re: A minimal LL(1) parser generator ? (2020-01-02)
[11 later articles]
| List of all articles for this month |

From: (Anton Ertl)
Newsgroups: comp.compilers
Date: Thu, 26 Dec 2019 16:21:29 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
References: 19-12-016
Injection-Info:; posting-host=""; logging-data="96917"; mail-complaints-to=""
Keywords: LL(1), tools
Posted-Date: 26 Dec 2019 12:12:38 EST

Andy <> writes:
>ANTLR has even LL(*) but is too complicated. I am searching maximal
>simple and elegant generator which generates function call like
>written by hand.

Sounds like you want a generator that generates a recursive-descent
parser. A while ago I compared a number of parser generators
[ertl99ef], and among those that generate recursive-descent parsers
the shortest one by far was Gray
<>. Whether you
consider it simple and elegant is for you to decide.

    author = {M. Anton Ertl},
    title = {Is {Forth} Code Compact? {A} Case Study},
    booktitle = {EuroForth'99 Conference Proceedings},
    year = {1999},
    address = {St. Petersburg, Russia},
    url = {},
    abstract = {Forth advocates often claim that Forth code is
                                    smaller, faster, and requires less development time
                                    than equivalent programs in other languages. This
                                    paper investigates this claim by comparing a number
                                    of parser generators written in various languages
                                    with respect to source code size. The smallest
                                    parser generator (14 lines) in this comparison is
                                    written in Forth, and the other Forth program is
                                    smaller than the others in its class by a factor of
                                    8 or more; however, the Forth programs do not have
                                    all the features of their counterparts. I took a
                                    closer look at Gray (in Forth) and Coco/R (in
                                    Modula-2) and found that several Forth features
                                    missing from Modula-2 give Gray more than a factor
                                    of three advantage over Coco/R (even if the other
                                    size differences were solely due to differences in
                                    functionality): run-time code generation; access to
                                    the parser and a simple, flexible syntax; and
                                    Forth's dictionary.}

Concerning the other question that came up, about dealing with left
recursion, and building the abstract syntax tree (AST) appropriately,
that part works nicely in Gray:

In BNF you would write

expr->expr '-' number

(and the parse tree would have the same structure as the AST).

Gray uses a variant of regular right part grammars, and you would
write that as a pure parsing rule as follows:

(( number (( '-' number )) ** )) <- expr

For buiding the AST, let's assume that "number" leaves a tree node on
the stack, and we have a word

new-bin-node ( node1 node2 -- node3 )

(i.e., it pops node1 and node2 from the stack, and pushes node3).
Then we can extend the above as follows:

(( number (( '-' number {{ new-bin-node }} )) ** )) <- expr ( -- node )

(The "( -- node )" comment documents the stack effect of parsing "expr").

Passing the nodes on Forth's stack rather than using pseudo-variables
like yacc's $1 etc. works nicely when going beyond BNF. Now I wonder
how Algol-based EBNF parser generators do it, but am too lazy to
actually research it.

Getting a right-recursive tree is actually slightly harder: Now we
actually have to perform recursion (non-recursive variants exist, but
are not simpler). Let's have a similar example, but with the
right-associative operator '^'.

nonterminal expr ( -- node ) \ declare before use
(( number (( '^' expr {{ new-bin-node }} )) ?? )) expr rule

- anton
M. Anton Ertl

Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.