25 Jan 1997 21:59:35 -0500

Related articles |
---|

[3 earlier articles] |

Re: Recursive descent and left recursion dlmoore@ix.netcom.com (David L Moore) (1997-01-16) |

Re: Recursive descent and left recursion cfc@world.std.com (1997-01-16) |

Re: Recursive descent and left recursion fjh@murlibobo.cs.mu.OZ.AU (1997-01-16) |

Re: Recursive descent and left recursion cfc@world.std.com (1997-01-17) |

Re: Recursive descent and left recursion will@ccs.neu.edu (William D Clinger) (1997-01-17) |

Re: Recursive descent and left recursion cfc@world.std.com (Chris F Clark) (1997-01-21) |

Re: Recursive descent and left recursion schoebel@eicheinformatik.uni-stuttgart.de (1997-01-25) |

From: | schoebel@eicheinformatik.uni-stuttgart.de (Thomas Schoebel-Theuer) |

Newsgroups: | comp.compilers |

Date: | 25 Jan 1997 21:59:35 -0500 |

Organization: | Informatik, Uni Stuttgart, Germany |

References: | 97-01-099 97-01-119 |

Keywords: | parse |

mfinney@lynchburg.net wrote:

|> However, I have been using recursive descent with left recursive

|> grammers for more than a decade. All it takes is the trivially

|> obvious check to allow the left recursion. Take, for example...

|>

|> (1) <exp> := <exp> + <term>

|> (2) <exp> := <term>

|>

|> when expanding (1), at the first term I simply check to see if the

|> current token in the input stream is the same as the last time that

|> the rewriting rule was expanded. If it is, then the parse has not

|> advanced and you have the infinite loop situation. I simply fail the

|> expansion and select an alternate rewriting rule for expansion.

Aehem, I think there is an error here: You try to get a

*deterministic* parser, i.e. a device avoiding backtracking from dead

ends, if I understood right. Your method of parse-time left recursion

elimination must not alter the accepted language, right? If you simply

fail one possible path, you cannot follow this path again if it should

be necessary later, by definiton of the "deterministicness". I see the

following problem: For example, the input <term> + <term> + <term>

cannot be parsed, because you have to choose the right number of

expansions of rule (1) *in advance* (i.e. at the first <term>) without

having seen the next tokens. At this point, you cannot know the number

of "+" that will appear later (if you don't have infinite lookahead).

A possible solution to this problem is what left corner parsers do

very regularly: you introduce a new instance of rule (1) *later* when

some additional "+" appears. This has a similar effect as transforming

the grammar to

(1') <exp> := <term> <expr'>

(2') <expr'> := + <expr>

(3') <expr'> := \epsilon

but continues to produce the *original* parse tree, i.e. you insert

the new instance of (1) "in the middle" of the already built parse

tree, und thus get the original left-recursive tree shape in contrast

to the right-recursive one of the modified grammar.

However, when you measure what is going on in terms of your original

grammar, the parse tree generation is either no longer top-down or no

longer deterministic, since you insert new rules "in the middle",

i.e. you *change* connections of your syntax tree later if it becomes

necessary. If you add inherited attributes, this may become a problem.

David L Moore <dlmoore@ix.netcom.com> writes:

|> At the risk of appearing thick, I do not see how this works. Surely,

|> given an expression like "1+2+3+4+5" I have to "pump" out four exp +

|> terms before I shift the first symbol (the 1) of the expression, so

|> the method suggested will not parse this expression.

Exactly, this is the problem.

|> What am I failing to grasp?

Nothing. Cycle detection alone is not sufficient. Leaving the normal

top-down depth-first evaluation order is only correct if all possible

paths can be (potentially) explored at all. So you can *rearrange*

(sort) the priority of evaluations in order to prevent infinite loops,

but you must not cut away any possible paths. However, changing the

order of evaluation may conflict with the LL(k) property: the

*deterministic parsing decision* must occur at the predictor step

(i.e. at the beginning of any rule), but you cannot decide the right

number of correct alternatives in your above example at this first

position.

So you cannot call a parser handling left-recursiveness LL(k) any

more. You can parse it deterministically (e.g. using LR(k)), but you

have to delay some parsing decicions after the predictor step. Note

that LR(k) is nothing but delaying the parsing decision until the

*end* of each rule, and exploring all possible paths occuring in the

meantime in parallel.

See my paper

@misc{Sch94,

author = {Thomas Sch\"obel-Theuer},

title = {Towards a Unified Theory of Context-Free Parsing},

howpublished = {presentation in: 1st ASMICS Workshop on Parsing Theory,

Dipartimento di Scienze dell'Informazione, Universita' degli Studi di Milano},

month = {13 October},

year = {1994}

*}*

@proceedings{ASMICS94,

editor = {Giovanni Pighizzini and Pierluigi San

Pietro},

title = {ASMICS Workshop on Parsing Theory},

organization = {ASMICS94, Rapporto Interno No 126-94, Dipartimento

di Scienze dell' Informazione, Universita' degli Studi di Milano},

month = {12--14 October},

year = {1994}

*}*

Greetings,

-- Thomas

--

Post a followup to this message

Return to the
comp.compilers page.

Search the
comp.compilers archives again.