Another backtracking parser generator: ANTLR.

quong@merlot.ecn.purdue.edu (Russell W Quong)
Thu, 30 Mar 1995 18:49:24 GMT

          From comp.compilers

Related articles
Another backtracking parser generator: ANTLR. quong@merlot.ecn.purdue.edu (1995-03-30)
| List of all articles for this month |

Newsgroups: comp.compilers
From: quong@merlot.ecn.purdue.edu (Russell W Quong)
Keywords: parse, PCCTS, LL(1)
Organization: Purdue University Engineering Computer Network
Date: Thu, 30 Mar 1995 18:49:24 GMT

Along the subject of parser generators that produce backtracking
parsers, please be aware of the current release (1.31) of ANLTR,
which was developed by Terence Parr and myself.


ANTLR
  - produces recursive-descent LL(k) parsers. In practice,
      k is limited to 3 or 4 depending on the grammar size.
          * allows the use of "predicates" to direct the parse.
          * Backtracking is goes under the name "syntactic predicate"


  - produces recursive-descent LL(k) parsers, which are easy to use/debug.
          * generates C++ code in the form of C++ parser classes,
                from which the user can subtype
          * it can also generate C code, though this option is showing its age.


  - has been a public-domain tool since the early 1990's,
      and is perhaps the most widely parser-generator after yacc/bison.


  - lexical analyzer support built in (no need for separate lexer file)


  - is part of the Purdue Compiler Construction Toolset (PCCTS)


  - has a newsgroup: comp.compilers.tools.pccts


  - has been ported to the PC/other platforms (see the newsgroup)


  - has an FTP-SITE = ftp://ftp.parr-research.com/pub/pccts


  - has technical papers
          * a user's manual
                  [ in FTP-SITE/documentation/manual.ps ]
                  (this manual is showing its age, but a book is being written).
          * a conference (1994 Intl Conf on Comp Constr) paper on predicates
     "Adding Semantic and Syntactic Predicates To LL(k): pred-LL(k)"
                  [ in FTP-SITE/papers/predicates.ps.Z ]
          * a soon-to-appear article in Software Practice and Experience,
     "ANTLR: A Predicated-LL(k) Parser Generator"
                  ( which we cannot give out until the article appears )


In both the conf. paper and SPE article, we give examples of how to
use predicates and backtracking. In particular, we specifically show
how get around the "lexer feedback hack" and how to parse C++
expressions and declaration via backtracking, as Chris Dodd mentioned
in his btyacc posting (excerpted below).


================================================================


> From: Chris Dodd <dodd@csl.sri.com>
> Subject: BTYACC -- yacc with backtracking and inherited attributes.
> < material omitted >
> BTYACC is a modified version of yacc that supports automatic
> backtracking and semantic disambiguation to parse ambiguous grammars, as
> well as syntactic sugar for inherited attributes (which tend to
> introduce conflicts). Whenever a btyacc generated parser runs into a
> shift-reduce or reduce-reduce error in the parse table, it remembers
> the current parse point (yacc stack and input stream state), and goes
> into trial parse mode. It then continues parsing, ignoring most rule
> actions. If it runs into an error (either through the parse table or
> through an action calling YYERROR), it backtracks to the most recent
> conflict point and tries a different alternative. If it finds a
> successful parse (reaches the end of the input or an action calls
> YYVALID), it backtracks to the point where it first entered trial parse
> mode, and continues with a full parse (executing all actions),
> following the path of the successful trial.
> < material omitted >
> * No more lexer feedback hack. In yacc grammars for C, a standard hack,
> known as the `lexer feedback hack' is used to find typedef names. The
> lexer uses semantic information to decide if any given identifier is
> a typedef-name or not and returns a special token. With btyacc, you
> no longer need to do this; the lexer should just always return an
> identifier. The btyacc grammar then needs a rule of the form:
>
> typename: ID [ if (!IsTypeName(LookupId($1))) YYERROR; ]


In ANTLR our grammar syntax is
    typename: << IsTypeName(LATEXT(1)) >>? ID


> * Easy disambiguation via simple ordering. Btyacc runs its trials via
> the rule ``try shifting first, then try reducing by the order that the
> conflicting rules appear in the input file''. This means you can deal
> with semantic a disambiguation rule like:
> [1] If it looks like a declaration it is, otherwise
> [2] If it looks like an expression it is, otherwise
> [3] it is a syntax error
> [Ellis & Stroustrup, Annotated C++ Reference Manual, p93]


In ANTLR:
    statement: (declaration)? declaration
| expression
;


Here (declaration)? means to try parsing declaration, returning
true or false. Note that the user must explicitly specify
backtracking via a syntactic predicate.


Please see the newsgroup comp.compilers.tools.pccts for further information.


Russell
--


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.