Re: regular expression operators in CF grammars

"tj bandrowsky" <tbandrow@unitedsoftworks.com>
3 Sep 2002 00:28:00 -0400

          From comp.compilers

Related articles
[16 earlier articles]
Re: regular expression operators in CF grammars rboland@unb.ca (Ralph Boland) (2002-07-04)
Re: regular expression operators in CF grammars cfc@world.std.com (Chris F Clark) (2002-07-15)
Re: regular expression operators in CF grammars kgw-news@stiscan.com (2002-07-21)
Re: regular expression operators in CF grammars nospam@snafu.de (Sönke Kannapinn) (2002-08-10)
Re: regular expression operators in CF grammars nospam@snafu.de (Sönke Kannapinn) (2002-08-10)
Re: regular expression operators in CF grammars cfc@shell01.TheWorld.com (Chris F Clark) (2002-08-23)
Re: regular expression operators in CF grammars tbandrow@unitedsoftworks.com (tj bandrowsky) (2002-09-03)
Re: regular expression operators in CF grammars cfc@shell01.TheWorld.com (Chris F Clark) (2002-09-12)
Re: regular expression operators in CF grammars cfc@TheWorld.com (Chris F Clark) (2002-09-14)
RE: regular expression operators in CF grammars qjackson@shaw.ca (Quinn Tyler Jackson) (2002-09-14)
| List of all articles for this month |

From: "tj bandrowsky" <tbandrow@unitedsoftworks.com>
Newsgroups: comp.compilers
Date: 3 Sep 2002 00:28:00 -0400
Organization: http://groups.google.com/
References: 02-05-142 02-05-151 02-06-024 02-06-056 02-06-090 02-07-010 02-07-025 02-07-043 02-08-035 02-08-078
Keywords: parse
Posted-Date: 03 Sep 2002 00:28:00 EDT

Chris,


In previous posting, you wrote...


>obj_id: id | obj_id "." id; // a recursive definition of id ("." id)*
>expr: obj_id
> | simple_func
> | obj_func
> | expr "+" expr
> ...
> ;
>simple_func: id "(" asgn_list ")"; // simple funcs have complex
params


>asgn_list: (obj_id "=" expr ";")+;


>obj_func: obj_id "(" expr_list? ")"; // obj funcs have simple params


>expr_list: expr ("," expr)*;
>The problem is this grammar isn't LR(k) for any k. Why not?
Because,
>one needs unbounded (aka infinite) lookahead to distinguish between
>the id of a simple_func and the obj_id of an obj_func.


So I took the liberty of trying to put that grammar into my diplodicus
parser as a test to see what would happen:


class chris_type : public shift_reduce_parser_type {
public:


chris_type()
{
trace_reduce = true;
trace_shift = true;


add_lexical_rule( 1, "id", "[/a][/a/d_]+" );
add_lexical_rule( 2, ".", "." );
add_lexical_rule( 3, "+", "+" );
add_lexical_rule( 4, ")", ")" );
add_lexical_rule( 5, "(", "(" );
add_lexical_rule( 6, "=", "=" );
add_lexical_rule( 7, ";", ";" );
add_lexical_rule( 8, ",", "," );
add_lexical_rule( 9, "ws", "/w" );
add_lexical_rule( 10, "number", "[/d]+" );


add_grammatical_rule( 10, "obj_id", "id" );
add_grammatical_rule( 11, "obj_id", "obj_id . id" );
add_grammatical_rule( 12, "expr", "obj_id" );
add_grammatical_rule( 13, "expr", "simple_func" );
add_grammatical_rule( 14, "expr", "expr + expr" );
add_grammatical_rule( 15, "expr", "number" );


add_grammatical_rule( 16, "simple_func", "id ( asgn_list )" );
add_grammatical_rule( 17, "asgn_list", "obj_id = expr ;" );
add_grammatical_rule( 18, "asgn_list", "asgn_list asgn_list" );
add_grammatical_rule( 19, "obj_func", "obj_id ( expr_list )" );
add_grammatical_rule( 20, "expr_list", "expr" );
add_grammatical_rule( 21, "expr_list", "expr_list , expr" );


}


};


void test_parser10()
{
chris_type chris;


chris.start();


char *program =
"window.title=42;move(name=47;weight=widget.something;)";


chris.parse(
program, strlen(program)
);


chris.finish();


}


Is the above the right intent of what you were trying to do?


If so, I was able to parse it with diplodicus. The trace for the
parse is below. Diplodicus tries to solve this sort of ambiguity
problem by never reducing until there is one and only one unambiguous
reduction to take. It keeps track of all the rules that are possibly
ambiguous as each token is shifted. The trick I used to make this
work is sensing the condition where you go from not reducing a
non-ambiguous rule because another rule is ambiguous with to have no
reductions to make at all, in which case you have to have backtrack at
most one token, take the unambiguous reduction you previously blew
off, and then proceed.


shift 'window'->id
type stack: id
value stack: window


shift '.'->.
type stack: id,.
value stack: window,.
rule: 10) obj_id [ stack:id ]
(was reduced by step back)
type stack: obj_id,.
value stack: $end_of_file,.


shift 'title'->id
type stack: obj_id,.,id
value stack: $end_of_file,.,title
rule: 11) obj_id [ stack:obj_id,.,id ]
type stack: obj_id
value stack: $end_of_file
type stack: obj_id
value stack: $end_of_file


shift '='->=
type stack: obj_id,=
value stack: $end_of_file,=


shift '42'->number
type stack: obj_id,=,number
value stack: $end_of_file,=,42
rule: 15) expr [ stack:obj_id,=,number ]
type stack: obj_id,=,expr
value stack: $end_of_file,=,$end_of_file
type stack: obj_id,=,expr
value stack: $end_of_file,=,$end_of_file


shift ';'->;
type stack: obj_id,=,expr,;
value stack: $end_of_file,=,$end_of_file,;
rule: 17) asgn_list [ stack:obj_id,=,expr,; ]
type stack: asgn_list
value stack: $end_of_file
type stack: asgn_list
value stack: $end_of_file


shift 'move'->id
type stack: asgn_list,id
value stack: $end_of_file,move


shift '('->(
type stack: asgn_list,id,(
value stack: $end_of_file,move,(


shift 'name'->id
type stack: asgn_list,id,(,id
value stack: $end_of_file,move,(,name


shift '='->=
type stack: asgn_list,id,(,id,=
value stack: $end_of_file,move,(,name,=
rule: 10) obj_id [ stack:asgn_list,id,(,id ]
(was reduced by step back)
type stack: asgn_list,id,(,obj_id,=
value stack: $end_of_file,move,(,$end_of_file,=


shift '47'->number
type stack: asgn_list,id,(,obj_id,=,number
value stack: $end_of_file,move,(,$end_of_file,=,47
rule: 15) expr [ stack:asgn_list,id,(,obj_id,=,number ]
type stack: asgn_list,id,(,obj_id,=,expr
value stack: $end_of_file,move,(,$end_of_file,=,$end_of_file


shift ';'->;
type stack: asgn_list,id,(,obj_id,=,expr,;
value stack: $end_of_file,move,(,$end_of_file,=,$end_of_file,;
rule: 17) asgn_list [ stack:asgn_list,id,(,obj_id,=,expr,; ]
type stack: asgn_list,id,(,asgn_list
value stack: $end_of_file,move,(,$end_of_file


shift 'weight'->id
type stack: asgn_list,id,(,asgn_list,id
value stack: $end_of_file,move,(,$end_of_file,weight


shift '='->=
type stack: asgn_list,id,(,asgn_list,id,=
value stack: $end_of_file,move,(,$end_of_file,weight,=
rule: 10) obj_id [ stack:asgn_list,id,(,asgn_list,id ]
(was reduced by step back)
type stack: asgn_list,id,(,asgn_list,obj_id,=
value stack: $end_of_file,move,(,$end_of_file,$end_of_file,=


shift 'widget'->id
type stack: asgn_list,id,(,asgn_list,obj_id,=,id
value stack: $end_of_file,move,(,$end_of_file,$end_of_file,=,widget


shift '.'->.
type stack: asgn_list,id,(,asgn_list,obj_id,=,id,.
value stack: $end_of_file,move,(,$end_of_file,$end_of_file,=,widget,.
rule: 10) obj_id [ stack:asgn_list,id,(,asgn_list,obj_id,=,id ]
(was reduced by step back)
type stack: asgn_list,id,(,asgn_list,obj_id,=,obj_id,.
value stack: $end_of_file,move,(,$end_of_file,$end_of_file,=,$end_of_file,.


shift 'something'->id
type stack: asgn_list,id,(,asgn_list,obj_id,=,obj_id,.,id
value stack: $end_of_file,move,(,$end_of_file,$end_of_file,=,$end_of_file,.,something
rule: 11) obj_id [ stack:asgn_list,id,(,asgn_list,obj_id,=,obj_id,.,id
]
type stack: asgn_list,id,(,asgn_list,obj_id,=,obj_id
value stack: $end_of_file,move,(,$end_of_file,$end_of_file,=,$end_of_file


shift ';'->;
type stack: asgn_list,id,(,asgn_list,obj_id,=,obj_id,;
value stack: $end_of_file,move,(,$end_of_file,$end_of_file,=,$end_of_file,;
rule: 12) expr [ stack:asgn_list,id,(,asgn_list,obj_id,=,obj_id ]
(was reduced by step back)
type stack: asgn_list,id,(,asgn_list,obj_id,=,expr,;
value stack: $end_of_file,move,(,$end_of_file,$end_of_file,=,$end_of_file,;
rule: 17) asgn_list [ stack:asgn_list,id,(,asgn_list,obj_id,=,expr,; ]
type stack: asgn_list,id,(,asgn_list,asgn_list
value stack: $end_of_file,move,(,$end_of_file,$end_of_file
rule: 18) asgn_list [ stack:asgn_list,id,(,asgn_list,asgn_list ]
type stack: asgn_list,id,(,asgn_list
value stack: $end_of_file,move,(,$end_of_file


shift ')'->)
type stack: asgn_list,id,(,asgn_list,)
value stack: $end_of_file,move,(,$end_of_file,)
rule: 16) simple_func [ stack:asgn_list,id,(,asgn_list,) ]
type stack: asgn_list,simple_func
value stack: $end_of_file,$end_of_file
rule: 13) expr [ stack:asgn_list,simple_func ]
type stack: asgn_list,expr
value stack: $end_of_file,$end_of_file
rule: 20) expr_list [ stack:asgn_list,expr ]
type stack: asgn_list,expr_list
value stack: $end_of_file,$end_of_file


shift ''->$end_of_file
type stack: asgn_list,expr_list,$end_of_file
value stack: $end_of_file,$end_of_file,


Unexpected end of file. Parse Stack:
|asgn_list,expr_list,$end_of_file| line 1,
  character 55


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.