Tue, 31 Oct 1995 16:19:10 GMT

Related articles |
---|

[7 earlier articles] |

Re: Regular Expressions Martin.Ward@durham.ac.uk (Martin Ward) (2004-10-17) |

Re: Regular Expressions choksheak@yahoo.com (ChokSheak Lau) (2004-10-21) |

Re: regular expressions wendt@CS.ColoState.EDU (1993-03-22) |

Regular Expressions rafae1@hp.fciencias.unam.mx (trejo ortiz alejandro augusto) (1995-10-16) |

Re: Regular Expressions mnp@compass-da.com (Mitchell Perilstein) (1995-10-23) |

Re: Regular Expressions cgh@cs.rice.edu (1995-10-29) |

Re: Regular Expressions odunlain@maths.tcd.ie (Colm O'Dunlaing) (1995-10-31) |

Re: Regular Expressions natasha@softlab.ece.ntua.gr (1995-11-03) |

Re: Regular Expressions sjmccaug@prairienet.org (1995-11-28) |

Re: Regular Expressions jmccarty@spdmail.spd.dsccc.com (1995-11-29) |

Newsgroups: | comp.compilers |

From: | Colm O'Dunlaing <odunlain@maths.tcd.ie> |

Keywords: | DFA, theory |

Organization: | Dept. of Maths, Trinity College, Dublin, Ireland. |

References: | 95-10-087 |

Date: | Tue, 31 Oct 1995 16:19:10 GMT |

1) Is there a canonical form for regular expressions(over a finite alphabet)?

Yes, sort of. Convert a given regular expression R to a

nondeterministic finite automaton N, then convert N to a

determininistic finite automaton D, then apply minimisation to

D to construct a minimal d.f.a. M. The automaton M is unique

for the regular set it accepts, so M can be taken as a canonical

form for the regular expression (I wouldn't bother about converting

M back to a regular expression).

Conversion from N to D may increase the number of states

exponentially. So the canonical form could be very

large. It is almost certain that any other choice of

canonical form could also be very large, since the

problem: Are R and S equivalent? (where R and S are

regular expressions) is PSPACE complete.

See

'Introduction to automata theory, languages, and computation,'

by Hopcroft and Ullman.

3) An ambiguous grammar for regular expressions over the alphabet {a, b}is the

following:

R::=RR | R + R | R* | (R) |a|b

The question is: How can I state an unambiguous grammar for regular

expressions?

(You probably need two more symbols \emptyset and \lambda

for the empty set and the empty string. However...)

I think, but am not absolutely certain, that the following

grammar will work. Implicitly, it prioritises operators

in the order * > concatenation > +, and orders evaluation

from right to left (I think).

A::= (R)|a|b

B::=A*|B*

C::=AR|BR

D::=A+R|B+R|C+R

R::=A|B|C|D

Yours, Colm O Dunlaing (Trinity College, Dublin, Ireland)

--

Post a followup to this message

Return to the
comp.compilers page.

Search the
comp.compilers archives again.