Re: A C++ Parser toolkit (Terence J Parr)
Mon, 12 Apr 1993 15:25:55 GMT

          From comp.compilers

Newsgroups: comp.compilers
From: (Terence J Parr)
Keywords: tools, PCCTS
Organization: Compilers Central
References: 93-04-042
Date: Mon, 12 Apr 1993 15:25:55 GMT

I'm very pleased by the posting of Mayan Moudgill
<moudgill@cs.cornell.EDU>; people are beginning to see that semantic
predicates are the way to recognize context-sensitive constructs rather
than having the lexer change the token type (ack!). Mayan writes:

> For instance, the following code:
> int name(Parse& P)
> {
>     Token t;
>     P, IDENT(t);
>     if( P && StbFind(t) ) {
>         return 1;
>     }
>     return 0;
> }
> int stmt(Parse & P)
> {
>     Token t;
>     P, MATCH(name), "=", NUMBER(val);
> }
> matches an identifier (i.e. [a-zA-Z_][a-zA-Z_0-9]*), '=', number string,
> but only if identifier is already in the symbol-table.

In PCCTS, we would write something akin to:

name : << IsVAR(LATEXT(1)) >>? IDENT
     ;

stat : name "=" NUMBER
     ;

where <<IsVAR(LATEXT(1))>>? is a semantic predicate; IsVAR is some
user-defined function and LATEXT(1) is the text of the first token of
lookahead. This example behaves exactly as Mayan outlines. We call this
a *validation* semantic predicate (syntactic predicates are coming in the
next release of PCCTS). Predicates can also be used to distinguish
between two syntactically ambiguous productions (*disambiguating* semantic
predicates). For example, let's add a production to stat that matches a
type name followed by a declarator:

name : << IsVAR(LATEXT(1)) >>? IDENT
     ;

type : << IsTYPE(LATEXT(1)) >>? IDENT
     ;

stat : name "=" NUMBER
     | type declarator
     ;

In this case, IDENT predicts both productions of stat, so k=1 lookahead is
syntactically insufficient. However, ANTLR (the parser generator of
PCCTS) finds two *visible* predicates (one in name and the other in type)
that can be used to semantically disambiguate the productions of stat.
Hence, it *hoists* the predicates for use in the prediction expressions
for stat, thus resolving the conflict. Note that with k=2, ANTLR could
uniquely predict stat's productions without predicates and would not hoist
the visible predicates.
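
To make the hoisting idea concrete, here is a rough hand-written sketch in
C++ of what a k=1 recursive-descent parser for the grammar above might look
like once the two predicates are used in stat's prediction. This is not the
code ANTLR generates; Parser, Symbols, Kind, Stat and parseStat are all
hypothetical names, Symbols stands in for the user-defined IsVAR and IsTYPE,
and declarator (which the grammar above leaves undefined) is assumed to be a
bare identifier.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical token kinds for the toy grammar; not PCCTS definitions.
enum class Kind { Ident, Assign, Number, Eof };

struct Token {
    Kind kind;
    std::string text;
};

// Hypothetical symbol tables standing in for the user-defined IsVAR / IsTYPE.
struct Symbols {
    std::set<std::string> vars;
    std::set<std::string> types;
    bool isVar(const std::string& s) const { return vars.count(s) != 0; }
    bool isType(const std::string& s) const { return types.count(s) != 0; }
};

enum class Stat { Assignment, Declaration, Error };

class Parser {
public:
    Parser(std::vector<Token> toks, const Symbols& syms)
        : toks_(std::move(toks)), syms_(syms) {
        toks_.push_back({Kind::Eof, ""});  // sentinel: lookahead never runs off the end
    }

    // stat : name "=" NUMBER | type declarator
    // With k=1 both alternatives begin with IDENT, so the predicates found in
    // name and type are tested here, in the prediction, before committing.
    Stat stat() {
        if (la().kind == Kind::Ident && syms_.isVar(la().text)) {
            return (name() && match(Kind::Assign) && match(Kind::Number))
                       ? Stat::Assignment : Stat::Error;
        }
        if (la().kind == Kind::Ident && syms_.isType(la().text)) {
            return (type() && declarator()) ? Stat::Declaration : Stat::Error;
        }
        return Stat::Error;  // neither alternative was predicted
    }

private:
    // Text and kind of the first token of lookahead (the role of LATEXT(1)).
    const Token& la() const {
        assert(pos_ < toks_.size());
        return toks_[pos_];
    }

    // Consume the lookahead token if it has the expected kind.
    bool match(Kind k) {
        if (la().kind != k) return false;
        if (k != Kind::Eof) ++pos_;
        return true;
    }

    // name : <<IsVAR(LATEXT(1))>>? IDENT   (predicate checked, then IDENT matched)
    bool name() { return syms_.isVar(la().text) && match(Kind::Ident); }

    // type : <<IsTYPE(LATEXT(1))>>? IDENT
    bool type() { return syms_.isType(la().text) && match(Kind::Ident); }

    // Assumption: a declarator is just an identifier in this sketch.
    bool declarator() { return match(Kind::Ident); }

    std::vector<Token> toks_;
    const Symbols& syms_;
    std::size_t pos_ = 0;
};

// Convenience wrapper: parse one stat from a token list.
Stat parseStat(std::vector<Token> toks, const Symbols& syms) {
    Parser p(std::move(toks), syms);
    return p.stat();
}
```

With x entered as a variable and T as a type, parseStat should classify
x = 1 as an assignment and T y as a declaration, and reject q = 1 because
neither hoisted predicate holds for q. A k=2 version of stat() could
presumably look at the second token instead (for instance "=" versus an
identifier under the declarator assumption above) and skip the predicate
tests in the prediction, which is the situation described above where the
predicates are not hoisted.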

PCCTS is in the public domain and may be obtained by sending email to with a blank "Subject:" line.

Terence Parr
Purdue University
School of Electrical Engineering
