help: using yacc, and my grammar needs to count items.

madings@baladi.nmrfam.wisc.edu (Steve Mading)
18 Apr 1999 02:16:07 -0400

          From comp.compilers

Related articles
help: using yacc, and my grammar needs to count items. madings@baladi.nmrfam.wisc.edu (1999-04-18)
Re: help: using yacc, and my grammar needs to count items. gneuner@dyn.com (1999-04-19)
Re: help: using yacc, and my grammar needs to count items. debray@CS.Arizona.EDU (1999-04-26)
| List of all articles for this month |

From: madings@baladi.nmrfam.wisc.edu (Steve Mading)
Newsgroups: comp.compilers
Date: 18 Apr 1999 02:16:07 -0400
Organization: University of Wisconsin, Madison
Keywords: parse, question, comment

I am running into something I can't express in BNF, and I can't figure
out how to make lex/yacc parse this correctly. Can anyone offer
advice on this? My problem is that my grammer needs to count items
and group them into 'rows' in a table even though there are no
delimiters.


Here's an example using this language's syntax:


      loop_
              _name1
              _name2
              _name3


              value1-1
              value2-1
              value3-1
              value1-2
              value2-2
              value3-2
              value1-3
              value2-3
              value3-3
      stop_


The idea is that this represents a table of data like so:


            name1 name2 name3
            -----------------------------------
            value1-1 value2-1 value2-1
            value1-2 value2-2 value2-2
            value1-3 value2-3 value2-3


But, as you can see, the problem is that until the program finishes
parsing the list of names, it doesn't know how many values exist in a
row. (a row could be 1 column, or 100, it depends on the size of the
name list). This is not expressable in a BNF-like grammer without
making seperate rules for each possible number of tags like so:


        Loop :== onename onevalrows |
                          twonames twovalrows |
threenames threevalrows |
... etc ...
        onename :== name
        onevalrows :== value | onevalRows value
        twonames :== name name
        twovalrows :== value value | twovalRows value value
        ...etc...


This is obviously not a practical solution when the list of names can be
any large size.


What I need to do is to be able to say "A foo consists of N bars, where
N is to be determined at runtime."


Does anyone know how to solve this?


(Before you suggest it, no I can't change the grammar - I'm working off
a syntax spec created by biochemists that has become a de-facto standard,
unfortunately.)


What I did previously to solve the problem was to simply read the
whole list of values, and then break them up into rows later on at the
end in C code. But this is becoming impractical as the loops get
bigger and bigger. (Yacc is eating up too much memory trying to parse
such a huge list without reaching the end of it.)


[I think your original solution is still the best. If yacc is eating up
too much space, that generally means that you have right-recursive rather
than left-recursive rules. Left recursive rules can loop indefinitely
without the parser using any more space. Another solution is to use a
Gross Lexical Hack, e.g.:


loop: names rows ;


names: NAME | names NAME ;


rows: row | rows row ;


row: values EOR ;


values: VALUE | values VALUE ;


Now each row is parsed as a "row" rule, nice and tidy. But, you ask,
where does the EOR come from? From the lexer, of course. Once you've
parsed the names, you can count them and tell that N values go in a
row. You have your lexer count the VALUEs it returns and after every
N values, return an EOR. This is disgustingly ad-hoc but will
probably add only six lines of code to your lexer. -John]


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.