Tue, 28 Nov 1995 08:26:05 GMT

Related articles |
---|

[9 earlier articles] |

Re: regular expressions wendt@CS.ColoState.EDU (1993-03-22) |

Regular Expressions rafae1@hp.fciencias.unam.mx (trejo ortiz alejandro augusto) (1995-10-16) |

Re: Regular Expressions mnp@compass-da.com (Mitchell Perilstein) (1995-10-23) |

Re: Regular Expressions cgh@cs.rice.edu (1995-10-29) |

Re: Regular Expressions odunlain@maths.tcd.ie (Colm O'Dunlaing) (1995-10-31) |

Re: Regular Expressions natasha@softlab.ece.ntua.gr (1995-11-03) |

Re: Regular Expressions sjmccaug@prairienet.org (1995-11-28) |

Re: Regular Expressions jmccarty@spdmail.spd.dsccc.com (1995-11-29) |

Newsgroups: | comp.compilers |

From: | sjmccaug@prairienet.org (Scott J. McCaughrin) |

Keywords: | DFA |

Organization: | University of Illinois at Urbana |

References: | 95-11-033 95-10-087 |

Date: | Tue, 28 Nov 1995 08:26:05 GMT |

In a previous article, odunlain@maths.tcd.ie (Colm O'Dunlaing) says:

*>1) Is there a canonical form for regular expressions(over a finite alphabet)?*

*> Yes, sort of. Convert a given regular expression R to a*

*> nondeterministic finite automaton N, then convert N to a*

*> determininistic finite automaton D, then apply minimisation to*

*> D to construct a minimal d.f.a. M. The automaton M is unique*

*> for the regular set it accepts, so M can be taken as a canonical*

*> form for the regular expression (I wouldn't bother about converting*

*> M back to a regular expression).*

This sounds more like a recipe for building the recognizer. So far as r.e.'s

per se are conscerned, there is no such stipulation as "canonical form".

*> Conversion from N to D may increase the number of states*

*> exponentially. So the canonical form could be very*

*> large. It is almost certain that any other choice of*

*> canonical form could also be very large, since the*

*> problem: Are R and S equivalent? (where R and S are*

*> regular expressions) is PSPACE complete.*

*>*

*> See*

*> 'Introduction to automata theory, languages, and computation,'*

*> by Hopcroft and Ullman.*

In any event, this is certainly not how r.e. recognizers are built in

practice.

*>3) An ambiguous grammar for regular expressions over the alphabet {a, b}is the*

*>following:*

*> R::=RR | R + R | R* | (R) |a|b*

*>The question is: How can I state an unambiguous grammar for regular*

*>expressions?*

*>*

*> (You probably need two more symbols \emptyset and \lambda*

*> for the empty set and the empty string. However...)*

Well, \emptyset can be provably excluded from the set recognized by a

d.f.s.a. (in fact, even from the n.f.s.a.)

*>*

*> I think, but am not absolutely certain, that the following*

*> grammar will work. Implicitly, it prioritises operators*

*> in the order * > concatenation > +, and orders evaluation*

*> from right to left (I think).*

*>*

No it doesn't.

*> A::= (R)|a|b*

*> B::=A*|B**

*> C::=AR|BR*

*> D::=A+R|B+R|C+R*

*> R::=A|B|C|D*

*>*

How can it prioritize operators if B,C,D are all top-level yet each

denotes a different operation? For starters, you need the start-symbol

(R) to produce only right-hand sides involving the lowest-priority ops.

But hey -- nice try!

-- Scott McC.

--

Post a followup to this message

Return to the
comp.compilers page.

Search the
comp.compilers archives again.