Re: Parsing Expression Grammar

Sylvain Schmitz <schmitz@i3s.unice.fr>
22 Sep 2005 23:47:16 -0400

          From comp.compilers

Related articles
[16 earlier articles]
Re: Parsing Expression Grammar Meyer-Eltz@t-online.de (Detlef Meyer-Eltz) (2005-09-14)
Re: Parsing Expression Grammar cleos@nb.sympatico.ca (Cleo Saulnier) (2005-09-17)
Re: Parsing Expression Grammar DrDiettrich@compuserve.de (Hans-Peter Diettrich) (2005-09-17)
Re: Parsing Expression Grammar Meyer-Eltz@t-online.de (Detlef Meyer-Eltz) (2005-09-18)
Re: Parsing Expression Grammar cleos@nb.sympatico.ca (Cleo Saulnier) (2005-09-22)
Re: Parsing Expression Grammar schmitz@i3s.unice.fr (Sylvain Schmitz) (2005-09-22)
Re: Parsing Expression Grammar schmitz@i3s.unice.fr (Sylvain Schmitz) (2005-09-22)
Re: Parsing Expression Grammar rsc@swtch.com (Russ Cox) (2005-09-22)
Re: Parsing Expression Grammar cfc@shell01.TheWorld.com (Chris F Clark) (2005-09-22)
Re: Parsing Expression Grammar schmitz@i3s.unice.fr (Sylvain Schmitz) (2005-09-22)
Re: Parsing Expression Grammar paul@parsetec.com (Paul Mann) (2005-09-22)
Re: Parsing Expression Grammar cfc@shell01.TheWorld.com (Chris F Clark) (2005-09-23)
Re: Parsing Expression Grammar gneuner2@comcast.net (George Neuner) (2005-09-25)
[5 later articles]
| List of all articles for this month |

From: Sylvain Schmitz <schmitz@i3s.unice.fr>
Newsgroups: comp.compilers
Date: 22 Sep 2005 23:47:16 -0400
Organization: Compilers Central
References: 05-08-115 05-09-008
Keywords: parse
Posted-Date: 22 Sep 2005 23:47:16 EDT

All this thread on PEGs has convinced me of giving them a better look.
But something isn't quite clear; in his ICFP'03 paper, Bryan Ford
presented an unambiguous CFG that could not be recognized by his packrat
parser:


                                                    P -> x P x | x


(this CFG is especially interesting for it has a quadratic parsing time
complexity using an Earley parser).
      I have tried "P <- x P x / x" to see what would happen, and indeed,
the packrat parser is only able to parse sequences of 2^n - 1 "x"
symbols, n > 0.


This problem does not seem to be addressed in his POPL'04 paper; the
well-formed definition, if I understand it correctly, is only concerned
about left recursion and does not reject the above grammar.
      Now, strictly speaking, the language recognized by "P <- x P x / x"
*is* {x^(2^n - 1) | n > 0} (and that's quite an achievement in itself),
but I find it a bit misleading due to the CFG generative habits.
Looking at this, I also wonder whether there exist a PEG for the CFL
{ww^R | w \in {a,b}^*}.


All this to get to the conclusion that the immediate translation of a
CFG into a PEG by replacing "|" with "/" will not necesseraly yield the
expected result. Even the simple grammar "S -> Ac, A -> a| ab" should
not be translated "S <- Ac, A <- a / ab" but "S <- Ac, A <- ab / a",
otherwise input "abc" would not be accepted.


Russ Cox wrote:
  > One thing PEGs definitely do well (and this is at least one reason why
  > Bryan worked on them) is that they are composable: you can take any
  > terminal symbol in a PEG, replace it with a non-terminal with
  > its own PEG grammar, and you get a bigger, but still valid, PEG grammar.
This is definitely an excellent property.


--
Regards,


      Sylvain


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.