ASP-style grammar

o_dwyer_john@hotmail.com (John O'Dwyer)
30 Mar 2003 00:44:49 -0500

          From comp.compilers

Related articles
ASP-style grammar o_dwyer_john@hotmail.com (2003-03-30)
Re: ASP-style grammar 6667@wp.pl (kat-Zygfryd) (2003-03-30)

From: o_dwyer_john@hotmail.com (John O'Dwyer)
Newsgroups: comp.compilers
Date: 30 Mar 2003 00:44:49 -0500
Organization: http://groups.google.com/
Keywords: parse, question, comment
Posted-Date: 30 Mar 2003 00:44:49 EST

I'm having difficulties building an ASP-style grammar to parse
something like this:


some boilerplate markup some boilerplate markup
some boilerplate markup some boilerplate markup
<% script code goes here %>
some boilerplate markup some boilerplate markup
some boilerplate markup some boilerplate markup


I want to use production rules:


file -> contents
contents -> content
contents -> contents, content
content -> boilerplate_markup
content -> script_block
script_block -> open_tag, script_expressions, close_tag


And regular expressions:


'(.|\n)*' = boilerplate_markup // any char including newline
'<%' = open_tag
'%>' = close_tag


The problem is that the lexer classifies the whole input as
boilerplate_markup rather than stopping at the open_tag (which should
then push a new lexer for the script expressions).
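
To illustrate the symptom, here is a minimal sketch using Python's re module as a stand-in for the lexer generator (an assumption; the actual tool is not named in this post). The sample text is made up for the demonstration. It shows the boilerplate pattern matching straight through the <% and %> markers to the end of the input:

```python
import re

# Hypothetical sample input with one script block in the middle.
text = "some boilerplate markup\n<% script code goes here %>\nsome boilerplate markup\n"

# The boilerplate rule from the post: any character, including newline, repeated.
boilerplate = re.compile(r"(?:.|\n)*")

match = boilerplate.match(text)

# The match runs to the end of the input, so the open_tag rule never gets a turn.
consumed_all = match.end() == len(text)
print(consumed_all)
```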


I can get it to work if I reclassify the boilerplate as a single char:


'(.|\n)' = boilerplate_markup


But this creates an unmanageably large parse tree.


I guess I need to rephrase the regular expressions, but I've tried
everything I can think of without success.


Any help would be gratefully received!


Many thanks in advance,


John.
[Most lexers read the largest chunk they can, so your original pattern
slurps right across the <% marker. Try something like this to force it
to stop and look at the <:

'(([^<]|\n)*|<)' = boilerplate_markup // a run of chars other than <, or a lone <
-John]
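
For anyone who wants to try the idea, below is a rough sketch of a longest-match lexer in Python. Several things in it are assumptions and not part of the thread: the tokenize function, the "markup" and "script" mode names, and the script_text rules are invented for illustration, with the mode switch standing in for the "new lexer" pushed at open_tag. Python's re takes the first alternative that succeeds rather than the longest one, so the suggested pattern is split into two separate rules and the longest match is chosen by hand.

```python
import re

# Rules per lexer mode as (token name, pattern) pairs.
# The mode names and the script-mode rules are assumptions for this sketch.
RULES = {
    "markup": [
        ("boilerplate_markup", re.compile(r"[^<]+")),  # run of chars other than '<'
        ("boilerplate_markup", re.compile(r"<")),      # a lone '<'
        ("open_tag", re.compile(r"<%")),
    ],
    "script": [
        ("script_text", re.compile(r"[^%]+")),         # run of chars other than '%'
        ("script_text", re.compile(r"%")),             # a lone '%'
        ("close_tag", re.compile(r"%>")),
    ],
}


def tokenize(text):
    """Split text into (token name, lexeme) pairs using longest-match rules."""
    mode, pos, tokens = "markup", 0, []
    while pos < len(text):
        # Try every rule at the current position and keep the longest match;
        # on a tie the earlier rule wins.
        best_name, best_end = None, pos
        for name, pattern in RULES[mode]:
            m = pattern.match(text, pos)
            if m and m.end() > best_end:
                best_name, best_end = name, m.end()
        # Every character is covered by some rule in each mode,
        # so a non-empty match is always found here.
        tokens.append((best_name, text[pos:best_end]))
        pos = best_end
        # Switch modes at the tags, standing in for pushing and popping a lexer.
        if best_name == "open_tag":
            mode = "script"
        elif best_name == "close_tag":
            mode = "markup"
    return tokens


sample = "some markup\n<% script code %>\nmore a < b markup\n"
for name, lexeme in tokenize(sample):
    print(name, repr(lexeme))
```

At "<%" both the lone-'<' rule and the open_tag rule match, and the longer open_tag match wins, which is the behaviour the suggested pattern relies on. A '<' that is not followed by '%' comes out as its own one-character boilerplate_markup token, so the parse tree stays small as long as such characters are rare.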

