4 Jul 2003 00:13:46 -0400

Related articles |
---|

RE: Recursive Descent vs. LALR qjackson@shaw.ca (Quinn Tyler Jackson) (2003-07-04) |

porting of gcc peephole opt rong@capsl.udel.edu (Hongbo Rong) (2003-07-15) |

RE: porting of gcc peephole opt Barak.Zalstein@ParthusCeva.com (Barak Zalstein) (2003-07-17) |

From: | Quinn Tyler Jackson <qjackson@shaw.ca> |

Newsgroups: | comp.compilers |

Date: | 4 Jul 2003 00:13:46 -0400 |

Organization: | Compilers Central |

Keywords: | parse |

Posted-Date: | 04 Jul 2003 00:13:46 EDT |

In-reply-to: | 03-07-024 |

Chris F. Clark said:

*> It is worth noting that the intersection of two cfgs need not be a*

*> cfg, but the intersection of two csgs is a csg. Predicates*

*> (mentioned above) allow one to write grammatically grammars that*

*> are the intersection of context free grammars, which allows*

*> predicated parsers (either LL or LR) to parse context-sensitive*

*> languages.*

That turns out to be a nifty observation in theory, so I thought I

would demonstrate it in practice.

grammar AnBnCn

{

S ::=

($a(X S.x<*>) $b(Y S.y<*>))<a b=anbn> $c(Z<S.x!=S.z> S.z<*>)<b

c=bncn>;

anbn ::= S.x [anbn] S.y;

bncn ::= S.y [bncn] S.z;

X ::= $S.x<-("(" '[a-z][a-z]+' ")");

Y ::= $S.y<-("(" '[a-z][a-z]+' ")");

Z ::= $S.z<-("(" '[a-z][a-z]+' ")");

*}; // grammar AnBnCn*

The above $-grammar accepts strings in the form:

(a)^n(b)^n(c)^n where |a| > 1, |b| > 1, |c| > 1, and a != c

Thus:

(cat)(cat)(dog)(dog)(mouse)(mouse)

but not

(cat)(cat)(dog)(mouse)(mouse)

and not

(cat)(cat)(mouse)(mouse)(cat)(cat)

How it actually goes about its magic is found in the back referencing

and predicates.

Here's an English-ed version of the first production, S:

$a(X S.x<*>))

"Match an X followed by zero to many of whatever X matched, and place

the result into a."

$b(Y S.y<*>))<a b=anbn>

"Match a Y followed by zero to many of whatever Y matched, and place

the result into b. Concatenate a and b, and parse the resulting string

with anbn. Fail if a+b is not accepted by anbn."

$c(Z<S.x!=S.z> S.z<*>)<b c=bncn>

"Match a Z. Fail if whatever matched X is equal to whatever matched Z,

otherwise, match zero to many of whatever matched Z, and place the

final result into c. Concatenate b and c, and parse the resulting

string with bncn. Fail if b+c is not accepted by bncn."

The cumulative result of the predicate intersections and the back

referencing is a grammar that accepts the language described.

*> BTW, adaptive grammars are equivalent in power to TMs.*

Right. Strictly speaking, the notations in the above grammar are

sufficient for Turing Power (give or take a notation), but I leave

that as an exercise for the reader.

--

Quinn Tyler Jackson

Post a followup to this message

Return to the
comp.compilers page.

Search the
comp.compilers archives again.