Fri, 1 Aug 2014 14:58:57 -0700 (PDT)

Related articles |
---|

[3 earlier articles] |

Re: Formatting of Language LRMs gah@ugcs.caltech.edu (glen herrmannsfeldt) (2014-06-21) |

Re: Formatting of Language LRMs kaz@kylheku.com (Kaz Kylheku) (2014-06-21) |

Re: Formatting of Language LRMs Pidgeot18@verizon.net.invalid (=?UTF-8?B?Sm9zaHVhIENyYW5tZXIg8J+Qpw==?=) (2014-06-22) |

Re: Formatting of Language LRMs mertesthomas@gmail.com (2014-06-30) |

Re: Formatting of Language LRMs ivan@ootbcomp.com (Ivan Godard) (2014-07-03) |

Re: Formatting of Language LRMs federation2005@netzero.com (2014-07-28) |

Re: Formatting of Language LRMs federation2005@netzero.com (2014-08-01) |

Re: Formatting of Language LRMs gah@ugcs.caltech.edu (glen herrmannsfeldt) (2014-08-03) |

From: | federation2005@netzero.com |

Newsgroups: | comp.compilers |

Date: | Fri, 1 Aug 2014 14:58:57 -0700 (PDT) |

Organization: | Compilers Central |

References: | 14-06-010 14-07-064 |

Keywords: | syntax, theory |

Posted-Date: | 02 Aug 2014 11:01:16 EDT |

On Monday, July 28, 2014 7:55:12 PM UTC-5, federat...@netzero.com wrote:

*> ... publication of algebraic frameworks for level 2 in the Chomsky*

*> Hierarchy. An algebra was defined in LNCS 4988 ("The Algebraic*

*> Approach I, II")*

A framework for 2nd order formalisms (expressed as Eilenberg-Moore algebras)

that does the analogue to what the Peano Axioms do for arithmetic.

I put some notes that expand on 4988 up on Scribd under

The Algebraic Approach II - V

http://www.scribd.com/doc/235317956/The-Algebraic-Approach-II-V

This is still a work in progress.

*> An equivalent formalism was devised by Kozen a couple years ago*

http://arxiv.org/abs/1309.0893

Actually Grathwohl is the lead author.

*> BNF allows sequences on the RHS of a rule.*

*> EBNF allows regular expressions on the RHS of a rule.*

*> (???) allows full-fledged context-free expressions (CFE) on the RHS.*

*>*

*> There are a couple ways one can specify a CFE. One, which harkens to the*

older

*> formalism (predating 2008) is the "grammar expressions", consisting of a*

*> series of rules followed by an expression (much like GCC's "statement*

*> expression"). (Even in subexpressions).*

The "older formalism" is the one which has been in place since the

1970's developed by Gruska, McWhirter and Yntema. The key elements are

the substitution operator (which I denote here (x = A, B)), and the

least fixed point operator (mu x.A). This may be grafted on top of the

Kleene algebra of regular expressions. The star A* B becomes (mu x.B |

Ax), for instance.

*> The newer notation permits "bra" and "ket" operators in expressions on the*

*> RHS.*

... and is the one I've spoken of here and elsewhere in the past. It is a

direct outgrowth of the Chomsky-Schutzenberger Theorem, and incorporates an

algebra (that would be very familiar to people in DSP who work with

multiresolution analysis)

bd = 1 = pq, bq = 0 = pd, db + qd = 1

and the {b,d,p,q} commute with the {x,y}'s.

So, the Dyck language, D(x,y) = mu S.(1 | xSy | SS) would simply become

b(xp | yq)* d.

Such an expression -- minus the algebra -- comes straight out of the

Chomsky-Schutzenberger Theorem, in which one obtains the language D(x,y) by

(1) expanding the alphabet X = {x,y} to X' = {x,y,b,d,p,q},

(2) devising a regular expression (which I term the "Chomsky-Schutzenberger

Kernel") b (xp | yq)* d over that alphabet

(3) taking the intersection of this with the interleave

(x | y)* interleave D(b,p;d,q)

where the 2-bracket Dyck language is D(b,p;d,q) = mu S.(1 | bSd | pSq | SS)

(4) mapping {b,p,d,q} to the empty word 1.

This can be extended to a more general algbera in which multiple bracket

symbols are permitted b0,b1,b2,... d0,d1,d2,.... The corresponding algebra

then assumes a form reminiscent of the "bra" and "ket" notation used in

Hilbert spaces:

b0, b1, b2, ... become the "bras" <0|, <1|, <2|, ...

d0, d1, d2, ... become the "kets" |0>, |1>, |2>, ...

and the underlying algebra generalizes to one which has the identities

If you are encoding context-free transductions with input alphabet X and

output alphabet Y, the very same framework applies, with the additional

element that the X's and Y's commute. The commutativity rule yx -> xy

corresponds to what in parsing theory is called a "look ahead".

The tricky part about incorporating these extensions is devising a syntax for

an extended BNF notation that allows all these elements to cohabit with one

another.

The result is a 3 level framework that has BNF (plus the fixed point and

substitution operators) at level 1, regular expression operators A*, A?, A+

(i.e. EBNF) at level 2, and the bra-ket algebra at level 3.

Thus, a syntax for D(x,y) (z D(x,y))* could be written as the substitution

expression ("Chomsky-Schuetzenberger" form)

S = <0|(<1|x)*(y|1>)*|0>, S(zS)*

or as the grammar expression ("Gruska-McWhirter-Yntema" form)

S -> SS, S -> xSy, S -> 1, S(zS)*

or in a wide variety of other ways.

Post a followup to this message

Return to the
comp.compilers page.

Search the
comp.compilers archives again.