Re: EBNF grammar for tabular format

Ray <bear@sonic.net>
Mon, 27 Jul 2009 09:08:24 -0700

          From comp.compilers

Related articles
EBNF grammar for tabular format pedro.kroger@gmail.com (=?ISO-8859-1?Q?Pedro_Kr=F6ger?=) (2009-07-13)
Re: EBNF grammar for tabular format bear@sonic.net (Ray) (2009-07-27)
| List of all articles for this month |

From: Ray <bear@sonic.net>
Newsgroups: comp.compilers
Date: Mon, 27 Jul 2009 09:08:24 -0700
Organization: Doubtful
References: 09-07-030
Keywords: EBNF, parse
Posted-Date: 29 Jul 2009 08:37:14 EDT

Pedro KrC6ger wrote:


> Hi,
>
> There's a program called humdrum that has a tabular format like:
>
> **foo **bar
> a b
> a b
> a b
>
> where **foo indicates the type of data on the 1st column and **bar on
> the 2nd. The format can have an infinite number of columns and many
> different types of data (i.e. many **<name> things).
> I'd like to know if I can describe this format using an EBNF grammar.
>
> My impression is that I can't describe this kind of thing with an EBNF
> grammar, is my assumption correct?


There's not enough information to answer the question, because we don't
know how "Extended" your "Backus-Naur Form" actually is.


Above, you describe a context-sensitive language.


BNF grammars, and the ordinary extensions to them, describe context-free
languages and you can't use them to write a grammar for it.


However, if your EBNF is so extended as to allow productions like
aBc -> abc


with B a nonterminal symbol, a and b arbitrary strings of terminals
and nonterminals, and c an arbitrary nonempty string of terminals and
nonterminals, then you can write a grammar in your particular EBNF
that completely describes the above language.


This extension to BNF is uncommon in software tools for formal
languages, because it doesn't seem to have much value; it is hard
to use it to write an efficient parser. You can find it sometimes
in tools intended for use in the field of natural (human) languages.


In practice, your problem is still easy to solve.


You can make an EBNF grammar that will accept any sentence of the
above language. This will not be a complete description of the language
because it will also accept sentences which do not have the language's
structure.


Then you can annotate rules of that production grammar with actions
that will modify the state of the parser, or check the parser's state,
and use logic outside the grammar itself to give the data structure or
allow or disallow particular productions.


In your particular case, you'd annotate the rule that reads the first
entity which isn't a column heading with an action that records in the
parser state how many columns there are, allocates space for the first
row, and initializes a counter. Then the rule to accept each new item
would be annotated with an action that checks the counter against the
number of columns, and if there's still room, increments the counter
and stores the item in the current row. If there's not room, it would
allocate a new row, reinitialize the counter, and store the object in
the new row.


A similar approach, of adding handwritten code to an EBNF grammar,
is a good solution for most of the context-sensitive languages that
people are actually interested in. This may be another reason why
it is uncommon to find a software tool that works with context-
sensitive grammars.


                                                                Bear



Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.