Re: Alternative Syntax for Regular Expressions?

eanders@cs.berkeley.edu (Eric Arnold Anderson)
16 Oct 2001 00:09:39 -0400

          From comp.compilers

From: eanders@cs.berkeley.edu (Eric Arnold Anderson)
Newsgroups: comp.compilers
Date: 16 Oct 2001 00:09:39 -0400
Organization: University of California, Berkeley
References: 01-10-029 01-10-072
Keywords: lex
Posted-Date: 16 Oct 2001 00:09:39 EDT

> [ Edward Nilges complains that RE's are unreadable and proposes the use
> of BNF.]


Your example was:


^(\([0-9]{3}\)[ ]{1}){0,1}[0-9]{3}\-[0-9]{4}$ Yecchhhh


If we were to translate this to Perl RE's directly, we would get (1):


$regex = qr/^(\(\d{3}\) )?\d{3}-\d{4}$/;


or if you want (2):


$area_code = qr/\(\d{3}\)/;


$regex = qr/^($area_code )?\d{3}-\d{4}$/;


or even perhaps (3):


$area_code = qr/\(\d{3}\)/;
$local_number = qr/\d{3}-\d{4}/;
$regex = qr/^($area_code )?$local_number$/;


or if you want to decorate with comments (4):


$regex = qr/^ # match at beginning of string
( # parenthesis to group together
\(\d{3}\) # area code
\ )? # separated by space, and optional
\d{3}-\d{4} # local number
$ # match at end of string
/x; # /x makes it an extended regexp.


or with the previously defined variables (5):


$regex = qr/^ # match at beginning of string
($area_code\ )? # optional area_code; the space must be escaped under /x
$local_number
$ # match at end of string
/x; # /x makes it an extended regexp.


The choice of which you prefer probably depends on how much of an
expert you are. I prefer 2 or 3, although in general for a regexp
this short, I would just write 1. I find 4 and 5 to be way too long,
but perhaps for beginners they would be better. And here we see one
instance of a problem I've run into before: programmers at different
levels of experience really want to write and read different
representations of programs.
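For what it's worth, the five forms are meant to accept exactly the same strings. Here is a minimal sketch of a check, assuming a handful of sample phone numbers of my own choosing; the %variant table and the verdicts() helper are illustrative names, not anything from the earlier posts. One thing it makes visible: under /x a bare space in the pattern is ignored, so the space after the area code has to be written as "\ ".

```perl
use strict;
use warnings;

my $area_code    = qr/\(\d{3}\)/;
my $local_number = qr/\d{3}-\d{4}/;

# The five forms, keyed by the numbers used in the text.
# In 4 and 5 the /x flag makes unescaped whitespace insignificant,
# so the literal space after the area code is written as "\ ".
my %variant = (
    1 => qr/^(\(\d{3}\) )?\d{3}-\d{4}$/,
    2 => qr/^($area_code )?\d{3}-\d{4}$/,
    3 => qr/^($area_code )?$local_number$/,
    4 => qr/^ ( \(\d{3}\) \ )? \d{3}-\d{4} $/x,
    5 => qr/^ ( $area_code \ )? $local_number $/x,
);

# Return one 0/1 verdict per variant, in order 1..5.
sub verdicts {
    my ($string) = @_;
    return map { $string =~ $variant{$_} ? 1 : 0 } sort keys %variant;
}

# Sample inputs (assumed for illustration).
my @samples = ('(510) 555-1234', '555-1234', '(510)555-1234',
               '5551234', '(510) 555-12345');

for my $s (@samples) {
    my @v     = verdicts($s);
    my $agree = scalar(grep { $_ == $v[0] } @v) == scalar(@v);
    printf "%-18s %s %s\n", "'$s'", join('', @v),
        $agree ? 'agree' : 'DISAGREE';
}
```

If the "\ " in variant 5 is replaced by a plain space, the first sample is where that variant parts company with the others.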


As for John's comment about BNF forcing a full stack parser, I would
imagine it would be "a simple matter of programming" to write a BNF
parser that auto-converts restricted BNF grammars down into the
equivalent automaton.
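To make that a little more concrete, here is a rough sketch under some loud assumptions: "restricted" is taken to mean "no recursive rules", the %grammar hash layout is a made-up stand-in for real BNF input, and rule_to_regex() is a hypothetical helper. It does not build an automaton itself; it only substitutes nonterminals until a plain regexp is left, and leaves the rest to whatever regexp engine or lex-style tool you hand the result to.

```perl
use strict;
use warnings;

# Hypothetical grammar format: nonterminal => list of alternatives,
# each alternative a list of symbols.  '<name>' refers to another
# rule, an empty list is the empty production, and anything else is
# a literal string.
my %grammar = (
    phone    => [ [ '<opt_area>', '<local>' ] ],
    opt_area => [ [ '<area>', ' ' ], [] ],
    area     => [ [ '(', ('<digit>') x 3, ')' ] ],
    local    => [ [ ('<digit>') x 3, '-', ('<digit>') x 4 ] ],
    digit    => [ map { [$_] } 0 .. 9 ],
);

# Expand one rule into regexp source text.  %$active tracks the rules
# currently being expanded, so a recursive rule is rejected instead of
# looping forever.
sub rule_to_regex {
    my ($name, $grammar, $active) = @_;
    die "unknown nonterminal <$name>\n" unless exists $grammar->{$name};
    die "rule <$name> is recursive\n"   if $active->{$name};

    $active->{$name} = 1;
    my @alternatives;
    for my $alt (@{ $grammar->{$name} }) {
        my $sequence = '';
        for my $symbol (@$alt) {
            if (my ($ref) = $symbol =~ /^<(\w+)>$/) {
                $sequence .= rule_to_regex($ref, $grammar, $active);
            } else {
                $sequence .= quotemeta $symbol;   # literal text
            }
        }
        push @alternatives, $sequence;
    }
    delete $active->{$name};
    return '(?:' . join('|', @alternatives) . ')';
}

my $pattern  = rule_to_regex('phone', \%grammar, {});
my $phone_re = qr/^$pattern$/;

for my $s ('(510) 555-1234', '555-1234', '(510)555-1234') {
    printf "%-16s %s\n", $s, ($s =~ $phone_re ? 'match' : 'no match');
}
```

The generated pattern is much uglier than any of the hand-written ones, but nobody has to read it; the grammar is the readable form.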

