Re: Grammar for optional elements (Torben =?iso-8859-1?Q?=C6gidius?= Mogensen)
Thu, 14 Jun 2007 11:32:26 +0200

          From comp.compilers

Related articles
Grammar for optional elements (Mohitz) (2007-06-12)
Re: Grammar for optional elements (2007-06-14)
Re: Grammar for optional elements (2007-06-16)
Re: Grammar for optional elements (Robert A Duff) (2007-06-16)
Re: Grammar for optional elements (Chris F Clark) (2007-06-16)
Re: Grammar for optional elements (Karsten Nyblad) (2007-06-17)
Re: Grammar for optional elements (Tony Finch) (2007-06-17)
Re: Grammar for optional elements (2007-06-18)
[9 later articles]
| List of all articles for this month |

From: (Torben =?iso-8859-1?Q?=C6gidius?= Mogensen)
Newsgroups: comp.compilers
Date: Thu, 14 Jun 2007 11:32:26 +0200
Organization: Department of Computer Science, University of Copenhagen
References: 07-06-019
Keywords: parse, design
Posted-Date: 16 Jun 2007 10:19:01 EDT

Mohitz <> writes:

> I need to create a parser for a language something like this.
> attribute1: value;
> attribute2: value;
> attribute3: value;
> All the attributes are optional but can occur only once...
> One way is to specify the grammar as follows: This doesn't do the
> 'SINGLE occurence check'
> attribute_list : attribute_list attribute | attribute

The above grammar does not allow the empty list of attributes.

> Can a grammar be designed which ensures that each attribute can be
> entered only once.
> [You can do it but you'll have a huge ugly grammar. You're much
> better off with a grammar that accepts any list of attributes, and
> reject duplicates in your semantic code. -John]

I agree with John, as specifiying a sequence of N different items that
can appear in any order and where each can occur at most once takes a
grammar of size > 2^N, as you essentially have to remember (in the
grammar rules) which options you have already seen.

If the order is fixed, i.e., if attribute1 must be before attribute2
(if both appear) and so on, it takes only a grammar of size N:

attribute_list : attribute1_option attribute2_option attribute3_option

attribute1_option : ATTRIBUTE1 ':' IDENTIFIER | \epsilon

attribute2_option : ATTRIBUTE2 ':' IDENTIFIER | \epsilon

attribute3_option : ATTRIBUTE3 ':' IDENTIFIER | \epsilon

where the last three productions specify that each attribute might be
empty (often, the epsilon is omitted, so you would have only
whitespace after the bar).

Even though the size of the grammar is > 2^N when the order is free,
this is not too bad if N=3 (2^3 = 8). You have a nonterminal for each
possible set of already-seen options:

attribute_list : a1 al_1 | a2 al_2 | a3 al_3 | \epsilon

al_1 : a2 al_12 | a3 al_13 | \epsilon

al_2 : a1 al_12 | a3 al_23 | \epsilon

al_3 : a1 al_13 | a2 al_23 | \epsilon

al_12: a3 | \epsilon

al_13: a2 | \epsilon

al_23: a1 | \epsilon




At three attributes it may be all right to check for duplictaes in the
grammar rules. But it is, as John said, going to be very ugly if you
have more than three attributes. So if you may later have to add
extra attributes, it is much easier to implement the extension if you
check for duplicates in the semantic actions.

[I think the size of the grammar is only N! but that's still way too big. -John]

Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.