Re: Grammar -> Parser question

"Torbjorn Drevin" <td@sysinno.se>
9 Jun 1998 11:58:55 -0400

          From comp.compilers

Related articles
Grammar -> Parser question chris63@my-dejanews.com (1998-06-04)
Re: Grammar -> Parser question td@sysinno.se (Torbjorn Drevin) (1998-06-09)
Re: Grammar -> Parser question torbenm@diku.dk (Torben Mogensen) (1998-06-09)
Re: Grammar -> Parser question dwight@pentasoft.com (1998-06-09)
Re: Grammar -> Parser question qjackson@wave.home.com (Quinn Tyler Jackson) (1998-06-09)
Re: Grammar -> Parser question qjackson@wave.home.com (Quinn Tyler Jackson) (1998-06-18)
| List of all articles for this month |

From: "Torbjorn Drevin" <td@sysinno.se>
Newsgroups: comp.compilers
Date: 9 Jun 1998 11:58:55 -0400
Organization: Algonet/Tninet
References: 98-06-018
Keywords: parse, LL(1)

Hi,


Maybe this example grammer can help you:


expr ::= <term> [<addop> <term>]*
term ::= <factor> [ <mulop> <factor> ]*
factor ::= <literal> | (<expr>) | <variable>


It may not suite your purpose, but have a look at the grammer for factor.
The factor can be a <expr> in LPARENT and RPARENT. This is where the
recursive thing begins. Now have a look at a grammer for condition:


cond ::= expr relop expr
relop ::= for example: != | >= | <= | == ....


A common way to handcode the parser is to make functions that correspond to
the BNF grammer. A function is done from each grammer, something like:


// Parse > cond ::= expr relop expr
void do_cond()
{
        do_expr();
        Match(relop);
        do_expr();
}


Note that it make calls to the expression function.


// Parse <expr> ::= <term> [<addop> <term>]*
void do_expr()
{
        do_term();
        while ( isaddop(get_token()) )
                do_term();
}


Do you get the point ?
The do_term function then make a call to do_factor, and do_factor make a
call back to do_term(), if LPARENT is found. This can now handle expressions
like:


        (((((((a*124)*12)*(1243 .....


It does not matter how complex the expression may look like.


One can need some "look ahead" sometimes. You should make a function
something like: char* get_token()


The function get_token() always have the next token stored.


If the source to parse look like:


A B


A call to get_token() return "A". But, the smart get_token() have also
stored the next token at char* ahead. if one look at whats in ahead, it
should be "B". This is a solution for this.


What you need, is to have a look at Jack Crenshaw's "Lets build a compiler".
He deals exactly with problems like this, and he explain it very good.


/Torbjorn Drevin doc.drevin@swipnet.se / td@sysinno.se




chris63@my-dejanews.com wrote...
>Suppose I have the following Grammar:
>
>condition -> expr PLUS expr |
> LPAREN condition RPAREN
>
>expr -> ID |
> LPAREN expr RPAREN
>
>I want to hand-code a top down recursive descent parser for this grammar
>using ideas from the Dragon Book (nonterminals become functions, terminals
>become calls to match()). I can't use lex or yacc.
>
>However, the LPAREN symbol is causing a problem: When I see it, I don't
>know if I'm dealing with a nested condition or a nested expression.
>
>How does one typically deal with this situation if one has to hand-code
>the parser? It looks like I need some sort of lookahead and/or backtracking
>mechanism so I can figure out which case I am dealing with.
--


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.