Mon, 16 Nov 1992 20:00:27 GMT

Related articles |
---|

Theoretical CFG question norlin@essex.ecn.uoknor.edu (1992-11-13) |

Re: Theoretical CFG question jos@and.nl (1992-11-15) |

Re: Theoretical CFG question pab@cs.arizona.edu (1992-11-16) |

Re: Theoretical CFG question jos@and.nl (1992-11-17) |

Newsgroups: | comp.compilers |

From: | pab@cs.arizona.edu (Peter A. Bigot) |

Organization: | U of Arizona CS Dept, Tucson |

Date: | Mon, 16 Nov 1992 20:00:27 GMT |

References: | 92-11-067 |

Keywords: | parse |

norlin@essex.ecn.uoknor.edu (Norman Lin) writes:

*>The problem was:*

*> **

*>For i>=1, let b(i) denote the string in 1(0+1) that is the*

*>binary representation of i.*

*>*

*>Construct a CFG generating*

*> +*

*> {0,1,#} - {b(1)#b(2)# ... #b(n) | n>=1}*

I'll refrain from specific comments about a professor who isn't interested

in addressing questions about problems nobody in a class can solve (did

you try asking hir in office hours?).

This is problem 4.3 in Hopcroft and Ullman. Note that although the

complement of the language is not a CFL (pump on it), CFL's aren't closed

under complement; they are closed under union, though, so you can simply

define CFGs for each way a string could fail to be in the form

b_1#b_2#...#b_n.

I've attached my own solution to this from a couple years back (this is

ugly TeX before I switched to cleaner LaTeX; the idea should be present).

The conclusion of the class (and the professor), however, was that the

problem was way too hard; and if the even numbered b_i were given in

reverse, the resulting grammar would be feasible (easier to generate

incorrect mirror images). As noted, the actual grammar below is not

correct.

\def\re#1{

\ifmmode{\rm\bf #1}

\else{\bf #1}

\fi}

\proofsect {Method}: The idea is to split $L$ into several languages, each

of which can be constructed using a context free grammar, then merging the

grammars to form one for $L$.

{\advance\leftskip by 0.25in

\item{1} Strings which do not begin with $b_1$ are not in the language.\par

\item{2} Strings in which one of the purported $b_i$ does not begin with a

$\re1$ are not in the language.\par

\item{3} Strings in which some sequence purported to be $b_i\#b_{i+1}$ has an

error are not in the language. I assume that $b_i$ is well formed, and the

error is in $b_{i+1}$; without this assumption, correctness is harder to

ensure (variations in $b_i$ may make the supposedly ill-formed $b_{i+1}$

correct). This set can be further subdivided:\par

{\advance\leftskip by 0.25in

\item{3a} If we assume that $b_i = \re1^n$, then $b_{i+1}$ should be

$\re1\re0^n$. The error is either in the leading \re1 or in the

sequence of $n$ \re0s of $b_{i+1}$.\par

\item{3b}. Otherwise, $b_i$ is of the form $\rho01^n$ where $n \ge 0$.

$b_{i+1}$ should then be $\rho10^n$. The error can occur in four places:

extra characters before the prefix, or an incorrect character match in

the prefix, the \re1 immediately following the prefix, or the trailing

\re0s.\par

*}}*

\hang A grammar which was derived from this partitioning is attached.

Nonterminals are enclosed in angle brackets, and productions can be combined

using a vertical bar when the nonterminal being rewritten is the same.

(Although this grammar does not seem to generate strings which are not in $L$,

there are strings of $L$ which it will not generate. Right idea; wrong

details---this really is a horrible problem.)

// Solution to problem 4.3, Hopcroft & Ullman

//

01# // symbols in CFG

<S> // External nonterminals

<S> // Start symbol

<S> -> <L1> // $w where w != (1 or 1#)

| <L2> // #w where w [1] != 1

| <H><L3><T> // there is a bi#bj where i+1 <> j in the mix

// Odds and ends--sequences of ones, digits, numbers, anything

<O> -> 1<O>

| //epsilon

<D> -> 1

| 0

<N> -> <D><N>

| //epsilon

<Q> -> 0

| 1

| #

<W> -> <Q><W>

| //epsilon

// Sequence starts with something other than b1

<L1> -> 0<W>

| #<W>

| 1<D><W>

// There is an ill-formed integer somewhere in the mix

<L2> -> <W>#0<W>

| <W>##<W>

// Head and tail of string: ensure that <H>w<T> makes w a bi#bj sequence.

<H> -> <W>#

| //epsilon

<T> -> #<W>

| //epsilon

<L3> -> <La> // Inverse of 1^n#10^n, intersect 1*#<W>

| <Lb> // Inverse of p01^n#p10^n

<La> -> 1<La>0 // 1s and 0s still balance

| 1<La_F>1 // Bad 1 on right

| #0<N> // Error--0 starts right

| #1<D><N> // Too much on right

| <O># // Too many 1s on left

<La_F> -> 1<La_F><Q> // Maintain balance of 1s/0s

| #<N> // Tack on whatever you want on right

// Inverse of p01^n#p10^n

// Assume error in prefix: 0 on left, 1 on right

<P-L-A0> -> <D><P-L-A0><D> | 0 // Error is center 0; generate prefix around it

<P-L-B0> -> <D><P-L-A0>0 // Generate 0 following prefix

<P-L-C0> -> <D><P-L-C0>1 | <P-L-B0> // Generate trailing 1s in left side

<P-L-D0> -> <D><P-L-C0>#<N> // Generate separating hash

<P-L-E0> -> <D><P-L-E0><D> | <P-L-D0> // Generate upper end of right prefix

<P-L-A1> -> <D><P-L-A1><D> | 1 // Error is center 1; generate prefix around it

<P-L-B1> -> <D><P-L-A1>0 // Generate 0 following prefix

<P-L-C1> -> <D><P-L-C1>1 | <P-L-B1> // Generate trailing 1s in left side

<P-L-D1> -> <D><P-L-C1>#<N> // Generate separating hash

<P-L-E1> -> <D><P-L-E1><D> | <P-L-D1> // Generate upper end of right prefix

<P-R-A1> -> <D><P-R-A1><D> | 1 // Error is center 1; generate prefix around it

<P-R-B1> -> <N>#<P-R-A1><D> // Separating hash

<P-R-C1> -> 1<P-R-C1><D> | <P-R-B1> // Tail 1s of left side

<P-R-D1> -> 0<P-R-C1><D> // After prefix 0 on left side

<P-R-E1> -> <D><P-R-E1><D> | <P-R-D1> // Right edge of left prefix

<P-R-A0> -> <D><P-R-A0><D> | 0 // Error is center 0; generate prefix around it

<P-R-B0> -> <N>#<P-R-A0><D> // Separating hash

<P-R-C0> -> 1<P-R-C0><D> | <P-R-B0> // Tail 1s of left side

<P-R-D0> -> 0<P-R-C0><D> // After prefix 0 on left side

<P-R-E0> -> <D><P-R-E0><D> | <P-R-D0> // Right edge of left prefix

// Assume error at first zero after prefix: pairs with 0 on right

<C-L-A0> -> <D><C-L-A0>1 | 0 // 0 after prefix

<C-L-B0> -> <D><C-L-A0># // Trailing 1s plus hash

<C-L-C0> -> <D><C-L-C0><D> | <C-L-B0> // Part of right prefix

<C-R-Z0> -> <D><C-R-Z0>0 | 0 // Make "1" after prefix a 0

<C-R-Y0> -> #<C-R-Z0>0 // Separating hash

<C-R-X0> -> 1<C-R-X0>0 | <C-R-Y0> // Last 1s of left side

// Assume error in trailing 1s on left side: pairs with 1 on right

<T-A> -> 1<T-A><D> // Last 1s of left tail, last Xs of right tail

| #<N>1 // Sep, prefix, 0, 1s, and error

<Lb> -> <P-L-A0><P-R-E1> // Break before end of left prefix

| <P-L-C0><P-R-C1> // Break somewhere in trailing 1s on left

| <P-L-E0><P-R-A1> // Break somewhere in left of right prefix

| <P-L-A1><P-R-E0> // Break before end of left prefix

| <P-L-C1><P-R-C0> // Break somewhere in trailing 1s on left

| <P-L-E1><P-R-A0> // Break somewhere in left of right prefix

// Error in first zero after left side prefix

| <C-L-A0><C-R-X0> // Break on left side

| <C-L-C0><C-R-Z0> // Break on right side

// Error in trailing 1s of left side after prefix

| <N>1<T-A>

--

Peter A. Bigot -- pab@cs.arizona.edu

Dept. of Computer Science, University of Arizona, Tucson AZ

--

Post a followup to this message

Return to the
comp.compilers page.

Search the
comp.compilers archives again.