Sat, 30 Jan 2010 17:19:27 +0000

Related articles |
---|

Parsing implicit operators with recursive descent? johann@myrkraverk.com (Johann 'Myrkraverk' Oskarsson) (2009-02-06) |

Re: Parsing implicit operators with recursive descent? mailbox@dmitry-kazakov.de (Dmitry A. Kazakov) (2009-02-07) |

Re: Parsing implicit operators with recursive descent? armelasselin@hotmail.com (Armel) (2009-02-07) |

Re: Parsing implicit operators with recursive descent? torbenm@pc-003.diku.dk (2009-02-09) |

Re: Parsing implicit operators with recursive descent? barry.j.kelly@gmail.com (Barry Kelly) (2009-02-12) |

Re: Parsing implicit operators with recursive descent? johann@myrkraverk.com (Johann 'Myrkraverk' Oskarsson) (2010-01-30) |

Re: Parsing implicit operators with recursive descent? kkylheku@gmail.com (Kaz Kylheku) (2010-02-01) |

From: | "Johann 'Myrkraverk' Oskarsson" <johann@myrkraverk.com> |

Newsgroups: | comp.compilers |

Date: | Sat, 30 Jan 2010 17:19:27 +0000 |

Organization: | Compilers Central |

References: | 09-02-013 09-02-022 |

Keywords: | parse |

Posted-Date: | 31 Jan 2010 01:13:29 EST |

A year ago, torbenm@pc-003.diku.dk (Torben Fgidius Mogensen) wrote:

*> "Johann 'Myrkraverk' Oskarsson" <johann@myrkraverk.com> writes:*

*>*

*>> Is it possible to parse implicit operators, like the regular*

*>> expression concatenation operator, with a recursive descent parser?*

*>*

*> Yes, you can do it mostly by standard grammar-rewriting techniques.*

*>*

*> Let us take regular expressions as an example. We start with the*

*> grammar:*

*>*

*>*

*> R -> R | R*

*> R -> R R*

*> R -> R **

*> R -> a*

*>*

*> Note that the | above is a terminal. a stands for any alphabet*

*> symbol.*

*>*

*> We start by eliminating ambiguity. We choose (as normal) | to bind*

*> weaker than concatenation which binds weaker than Kleene-star. We*

*> also choose | and concatenation to be right-associative. They are*

*> semantically fully associative, so we could choose any. But we get*

*> less left-recursion to eliminate later by choosing them to be*

*> right-assciative.*

*>*

*> R -> C | R*

*> R -> C*

*> C -> S C*

*> C -> S*

*> S -> S **

*> S -> a*

*>*

*> We now do left factorisation and left-recursion elimination:*

*>*

*> R -> C R'*

*> R' -> | R*

*> R' ->*

*> C -> S C'*

*> C' -> C*

*> C' ->*

*> S -> a S'*

*> S' -> * S'*

*> S' ->*

*>*

*> To see if this is LL(1), we need to calculate FIRST and FOLLOW.*

*> Below, I have for each production added the FIRST set:*

*>*

*>*

*> R -> C R' {a}*

*> R' -> | R {|}*

*> R' -> {}*

*> C -> S C' {a}*

*> C' -> C {a}*

*> C' -> {}*

*> S -> a S' {a}*

*> S' -> * S' {*}*

*> S' -> {}*

*>*

*> The FOLLOW sets are:*

*>*

*> FOLLOW(R) = {$}*

*> FOLLOW(R') = {$}*

*> FOLLOW(C) = {$,|}*

*> FOLLOW(C') = {$,|}*

*> FOLLOW(S) = {$,|,a}*

*> FOLLOW(S') = {$,|,a}*

*>*

*> This leads to the following parse table:*

*>*

*> | a * | $*

*> --+------------------------------*

*> R | R->CR'*

*> R'| R'->|R R'->*

*> C | C->SC'*

*> C'| C'->C C'-> C'->*

*> S | S->aS'*

*> S'| S'-> S'->*S' S'-> S'->*

*>*

*> which has no conflicts, so the grammar is LL(1).*

*>*

*> The parse table can easily be rewritten to a recursive descent*

*> parser.*

It has been a year since I was working on this toy project, and now

that I'm digging it up again, I realize I don't understand the parse

table above.

In particular, what does "S'-> " mean? What does it mean when the

parse table entry is empty? Like when R' hits "a".

Perhaps it's been too long since I was reading the dragon book. I

have the 1988 version, according to the copyright section.

Johann -- in hope for clarifications.

P.S. And thanks everyone who responded to my original post, though you

may not remember.

Post a followup to this message

Return to the
comp.compilers page.

Search the
comp.compilers archives again.