3 Jun 2006 18:53:08 -0400

Related articles |
---|

Calculating maximum number of grammar productions ali@hodroj.net (2006-05-15) |

Re: Calculating maximum number of grammar productions owong@castortech.com (Oliver Wong) (2006-06-03) |

From: | "Oliver Wong" <owong@castortech.com> |

Newsgroups: | comp.compilers |

Date: | 3 Jun 2006 18:53:08 -0400 |

Organization: | GlobeTrotter |

References: | 06-05-051 |

Keywords: | parse, theory |

Posted-Date: | 03 Jun 2006 18:53:08 EDT |

<ali@hodroj.net> wrote in message news:06-05-051@comp.compilers...

*> Hello,*

*>*

*> Given a LALR grammar and a set of language tokens. Is there an*

*> algorithm to find out how many productions are possible for a specific*

*> token-wide length statement. For example:*

*>*

*> Consider the following grammar:*

*>*

*> Expr:*

*> Expr + Expr | Expr - Expr | 1 | 0*

*>*

*> for example: a 5 token-wide length production would be one that has 5*

*> tokens in it,*

*> like: 1 + 1 - 0*

*>*

*> For a simple grammar like above, you can come up with a mathematical*

*> formula using factorials or permutations to calculate how many possible*

*> productions you can have, However, what if you have language like C ?*

*> assuming--to make things simple-- for every token type there is only*

*> one associated string (ie: a variable name is always "var", a function*

*> name is always "function"), what would be a suitable algorithm to*

*> calculate the possible number of productions for a specific number of*

*> tokens in a statement?*

It might help to first convert the grammar to chomsky normal form (see

http://en.wikipedia.org/wiki/Chomsky_normal_form)

<quote>

Every grammar in Chomsky normal form is context-free, and conversely, every

context-free grammar can be efficiently transformed into an equivalent one

which is in Chomsky normal form.

[...]

throughout the derivation of a string, each string of terminals and

nonterminals is always either the same length or one element longer than the

previous such string. The derivation of a string of length n is always

exactly 2n-1 steps long. Furthermore, since all rules deriving nonterminals

transform one nonterminal to exactly two nonterminals, a parse tree based on

a grammar in Chomsky normal form is a binary tree, and the height of this

tree is limited to at most the length of the string.

</quote>

So basically, you walk the grammar tree, counting all possible derivations,

always stopping when you reach height n, because you know if you go to a

deeper height, you will end up with a string exceeding length n.

- Oliver

Post a followup to this message

Return to the
comp.compilers page.

Search the
comp.compilers archives again.