# re: Grammar Algebra?

## Keith Clarke <keithc@dcs.qmw.ac.uk>Mon, 2 Nov 1992 19:36:50 GMT

From comp.compilers

Related articles
Re: Grammar Algebra? hallpwcd@ucunix.san.uc.EDU (1992-10-31)
re: Grammar Algebra? keithc@dcs.qmw.ac.uk (Keith Clarke) (1992-11-02)
Re: Grammar Algebra? Peter.Breuer@prg.oxford.ac.uk (1992-11-04)
| List of all articles for this month |

 Newsgroups: comp.compilers From: Keith Clarke Organization: Compilers Central Date: Mon, 2 Nov 1992 19:36:50 GMT References: 92-10-126 92-10-122 Keywords: parse, theory

John Levine says that union of two grammars is easy, give or take problems
of ambiguity. Ambiguity is easy if you don't mind parsers that return
lists (sets) of possible parses, & this makes intersection easy too.

M. Anton Ertl writes:

|Very interesting, as we can use intersection to construct all-time
|favourites like a^n b^n c^n:
|
|S = A intersect B
|A = A1 c*
|A1 = a A1 b
|B = a* B1
|B1 = b B1 c

I just added a few lines to my collection of parsing functionals and can now
interpret input like:

(triple (aChar 'a') (aChar 'b') (aChar 'c')) "aabbcc"

The function "triple" takes three parsers p, q and r (in the example,
these recognise the languages {a}, {b} and {c} respectively) and returns a
parser for p^n q^n r^n, implemented as Anton suggests.

My parsing functions are copied from/inspired by Graham Hutton "Parsing
using Combinators", in Functional Programming Glasgow 1989, ed. Davis &
Hughes, publ Springer.

Intersection works in a set-like way, on the "set of interpretations"
produced by the alternatives. Only legal parses have an interpretation, &
it is only interesting when the interpretations "flatten" the parse tree
somehow - otherwise the two grammars seem certain to give different parse
trees & hence an empty intersection.

===========miranda script follows============

tokens == [char]
result * == (*,tokens)
parser * == tokens ->[result *]

|| The sequential combinator \$then returns pairs, which need tidying up
|| from time to time by this function
interpret::parser * -> (*->**) -> parser **
interpret f g ts = [(g fvalue,rest) | (fvalue,rest) <- f ts]

|| Concatenation
then::parser * -> parser ** -> parser (*,**)
then f g ts
= [((fval,gval),grest) | (fval,frest) <- f ts; (gval,grest)<-g frest]

aChar::char->parser char
aChar s [] = []
aChar s (t:toks)
= [(s,toks)], if s=t
= [], otherwise

|| union - the two languages have to have the same kind of interpretation
else:: parser * -> parser * -> parser *
else f g cs
= f cs ++ g cs

|| intersection: treat the (interpretation,remaining input) lists as sets
intersect:: parser * -> parser * -> parser *
intersect f g cs
= f cs \$inter g cs
where inter ps [] = []
inter [] ps = []
inter (x:xs) ys = x:inter xs ys, if member ys x
= inter xs ys, otherwise

|| always succeeds, consumes no input - recognises the empty language
succeed:: * -> parser *
succeed x cs = [(x,cs)]

|| many f parses the language f*
|| When the input looks like n consecutive fs, it returns a list
|| of (n+1) legal interpretations..
many:: parser * -> parser [*]
many f = ((f \$then many f) \$interpret cons) \$else succeed []
where cons (x,xs) = x:xs

|| Solve a^n b^n c^n
triple:: parser * -> parser ** -> parser *** -> parser ([*],[**],[***])
triple a b c
= ((many a \$then bc) \$interpret bar1)
\$intersect
((ab \$then many c) \$interpret bar2)
where bc = ((b \$then bc \$then c) \$interpret foo) \$else succeed []
ab = ((a \$then ab \$then b) \$interpret foo) \$else succeed []
foo (b,(bcs,c)) = (b,c):bcs
bar1 (as,bcs) = (as,bs,cs) where (bs,cs) = unzip bcs
bar2 (abs,cs) = (as,bs,cs) where (as,bs) = unzip abs
unzip [] = ([],[])
unzip ((x,y):xys) = (x:xs,y:ys) where (xs,ys) = unzip xys
--

Post a followup to this message