20 Apr 2003 17:37:19 -0400

Related articles |
---|

Grammars with semantics haberg@matematik.su.se (2003-04-20) |

Re: Grammars with semantics rda@lemma-one.com (Rob Arthan) (2003-04-27) |

Re: Grammars with semantics haberg@matematik.su.se (2003-04-27) |

Re: Grammars with semantics haberg@matematik.su.se (2003-05-06) |

From: | haberg@matematik.su.se (Hans Aberg) |

Newsgroups: | comp.compilers |

Date: | 20 Apr 2003 17:37:19 -0400 |

Organization: | Mathematics |

Keywords: | parse, theory, question |

Posted-Date: | 20 Apr 2003 17:37:19 EDT |

Has somebody studied grammars G together with a semantics S in the sense

below? -- Some common grammars are ambiguous, but even with a universal

semantics, they may become unambiguous (meaning, different valid parses

produce the same semantic value).

Take the "calculator" grammar G = (T, N, P, E), terminals T = {i, `+',

`*', `(', `)'}, nonterminals N = {E, T, F}, sentence symbol E, and the

set of productions P containing:

P_1: E -> T

P_2: E -> E `+' T

P_3: T -> F

P_4: T -> T `*' F

P_5: F -> i

P_6: F -> `(' F `)'

The expression i + i*i has several parses, for example the leftmost (as in

LL, top-down, parsing) and rightmost (as in LR, bottom-up parsing):

E E

-> E + T -> E + T

-> T + T -> E + T * F

-> F + T -> E + T * i

-> i + T -> E + F * i

-> i + T * F -> E + i * i

-> i + F * F -> T + i * i

-> i + i * F -> F + i * i

-> i + i * i -> i + i * i

So this grammar us ambiguous, even though when imposing the condition that

the rightmost or leftmost derivation should be used, the ambiguity

disappears: So LL (if removing left recursion) and LR parsers would not

complain.

Now attach a semantics: There is a set of semantic values V, and each production

P_j: a_1 ... a_m -> b_1 ... b_n

is assumed to have an associated function

f_j: V^n -> V^m.

Each string s = s_1 ... s_n, where s_i in T union N, is successively

assigned values in V^n by using these functions f_j: Each terminal is

assumed to have a value. Then, in a derivation Z => s, when a production

P_j is applied, one can get a new value by the use of the function f_j on

the changed components to successively work towards the sentence symbol Z.

It then gets a value in V^1 = V, which is defined to be the semantic value

of the derivation Z => s then. If different derivations Z => s yield the

same semantic values, then this can be used as a semantic value of s.

Now, when I do that for the grammar G above, I get the same semantic value

for the expression i + i * i for the leftmost and the rightmost parses,

namely

(f_2(f_1(f_3(f_5(x_1))), x_2, f_4(f_3(f_5(x_3)), x_4, f_5(x_5)))) in V^1

if the value of the original string is (x_1, x_2, x_3, x_4, x_5) in V^5.

Thus the funny thing happens that even though there are different parses

of i + i*i, the "universal" semantic values agree.

Now, has somebody written this phenomenon down, with conditions

identifying semantically unambiguous grammars? References?

Hans Aberg * Anti-spam: remove "remove." from email address.

* Email: Hans Aberg <remove.haberg@member.ams.org>

* Home Page: <http://www.math.su.se/~haberg/>

* AMS member listing: <http://www.ams.org/cml/>

Post a followup to this message

Return to the
comp.compilers page.

Search the
comp.compilers archives again.