Re: Extensible grammars

Ralf Laemmel <Ralf.Laemmel@cwi.nl>
23 Dec 2004 21:37:30 -0500

          From comp.compilers

Related articles
Extensible grammars skaller@nospam.com.au (John Max Skaller) (2004-12-13)
Re: Extensible grammars claus.reinke@talk21.com (Claus Reinke) (2004-12-16)
Re: Extensible grammars skaller@nospam.com.au (John Max Skaller) (2004-12-17)
Re: Extensible grammars skaller@nospam.com.au (John Max Skaller) (2004-12-17)
Re: Extensible grammars Ralf.Laemmel@cwi.nl (Ralf Laemmel) (2004-12-23)
| List of all articles for this month |

From: Ralf Laemmel <Ralf.Laemmel@cwi.nl>
Newsgroups: comp.compilers
Date: 23 Dec 2004 21:37:30 -0500
Organization: Compilers Central
References: 04-12-058 04-12-070 04-12-086
Keywords: parse
Posted-Date: 23 Dec 2004 21:37:30 EST

John Max Skaller replied to Claus Reinke who wrote about generic
traversal and Strafunski:


>BTW: Barry Jay has a new programming language that can do this
>without any external machinery (eg in a term for an expression you
>can just sum all the integer constants with a fold, it automatically
>finds all the int terms) -- and 4 other kinds of polymorphism.


((( This discussion has thereby entered the terrain of generic
programming, which has more to offer than Strafunski and Barry Jay's
calculi and languages. I am Haskell-biased :-) and in fact so much
work on generic programming (including generic traversal) has been
done for Haskell. So please find some typical references [1],[2],[3]
below. BTW, Jay's work on FISh and constructor calculus is
outstanding, but Haskell and extensions thereof has certainly also a
respectable *number* of polymorphisms to offer :-) first-class
polymorphism, rank-1, rank-2, ... polymorphism, parameteric
polymorphism, type-class bounded polymorphism, a bit of intensional
polymorphism (polytypism, ..., in some form or another via the generic
programming extensions that are around, some being provided as
language extensions; no external machinery!), subtype polymorphism (by
transitive closure, not widely known but still true). Did I mention
that I like Haskell? )))


Back to the original topic: open recursion for grammar extension.


Open recursion is convenient in OCaml in several ways. Most notably,
it is great for the purpose of enabling inheritance when using OCaml's
object system. I am sure it can be used conveniently. I think most
appealing examples deal with *one* extensible type (e.g., _the_
class), so there is just one parameter. Also, OCaml has (equi-)
recursive types (not to be confused with recursive datatypes being
modelled). This takes away some of the possible hassle that occurs
when fixating by means of an explicit Fix-point datatype constructor
(as people tend to do in Haskell whose type system excludes recursive
types for some good reasons).


This sort of open recursion style is not really in use for
context-free grammars (as far as I know; _other_ sorts extensible
context-grammars have been investigated; see some comments below). But
open recursion has been celebrated for algebraic
datatypes. (Context-free grammars enjoy a simple algebraic
interpretation!!!) The idea goes back to the mid-1980ies, see [4-5]
for some old references, this technique is used also more recently
[6], and it is also useful in the context of type systems that go
beyond algebraic datatypes [7].


I consider using "functorial style" datatypes, which I think is a good
name for it, as somewhat of a *hassle*. It's great if your programming
problem involves just one datatype, but otherwise the amount of extra
parameters (one per constructor component, or at least per
constructor-compont type) is a pain; you get into position problems
and all that. (As an implementation platform, functorial style might
be Ok, say I don't see it being used as a user.)


Also, do we need this sort of parameterisation hassle for grammars? I
doubt it. Let's adopt an object-oriented view on the problem. There
are several notions of OOish context-free grammars (and formalisms on
top of it such as attribute grammars), see Koskimies' work for example
[8]. Why would we want to _explicitly_ parameterise in nonterminal
symbols? Let's just stand the nonterminals for scoped parameters
_themselves_ --- just like static references to abstract classes in an
OO program. No need to fixate! Extension is easy; just write down
another production! I am not a fan of OO, but here it scores. (I will
discuss below that Haskell can do extensibility without OO; it can do
with class.)


Let me also challenge the overall notion of extensible grammars and
perhaps the implied wish for language support for them. In my
experience and in the application domains that I have looked into
(e.g., software re-engineering frontends) grammar evolution too often
goes beyond grammar extension *anyhow*, but more arbitrary grammar
_adaptation_ is needed very often. So principled support for grammar
extension does perhaps only cover a too small part of the problem, so
does open-recursion in the sense of functorial style programming.


However, I find it very interesting to think of language support for
even grammar adaptation including grammar extension, and I would like
to mention my work on foundations and applications of grammar
adaptation, which in some form could lead to language support, but all
current implementations are more like off-line meta-programming tools,
which are then still better than the kind of copy-paste-revise modus
operandi that you correctly identified as a problem; see [9] or more
generally [10]. (There are many more people doing related stuff,
e.g., James R. Cordy [11]) When I use the term "adaptation" I mean to
include refactoring, semantics-increasing, -decreasing, and -replacing
transformation, omitting definitions for those terms, and even
references to more papers.


Let me go on by saying that one *can* get a little bit more out of
"extension" once we stretch this term to include some reasonable forms
of modular composition, in particular: some forms of join or merge and
definitely overriding. Say, we will perhaps not be able to adapt
productions, but at least each productions and all the functionality
that goes with them can be in one small module subject too many
different compositions. And overriding allows us to replace
inappropriate productions. So we are BACK TO OO?


I really appreciate the poster's question because it inspired me to
realise that Haskell readily provides extensible datatypes --- it is
just not widely known. Here is the standard vs. extensible encoding of
some tiny interpreter example.


http://homepages.cwi.nl/~ralf/OOHaskell/src/interpreter/nonextensible.hs
http://homepages.cwi.nl/~ralf/OOHaskell/src/interpreter/extensible.hs


It does not *use* functorial style datatypes. No hassle then.
(Neither does it use OOP in Haskell, which it could.) It rather uses
Haskell type classes. It is perhaps a bit verbose but not much more
than equivalent C++ code. (Is that true?) Syntactic sugar could
dramatically improve matters, and we are working on that. This style
has *some* problems admittedly [12].


One final note on the link between generic traversal and extensible
grammars:


I would like to add that robustness of program code in the view of
grammar extensions is perhaps more important than actually having
language support for extensible grammars. (Say, changing the grammar
and recompiling the whole app is Ok; needing to change the program
just because some conceptually irrelevant constructor was added is not
Ok.) Generic programming/traversal achieves robustness by means of
conciseness; it only talks about the constructors that one should
really care about. Sadly, conciseness can also lead to inconsistent
traversal in the sense that we should have added a case to a generic
traversal algorithm (say to a visitor in OO terms), but we were not
reminded of this fact because a generic default is just assumed
generally. This problem has been identified and discussed by the
Adaptive programming people (Karl Lieberherr et al.).


Regards,
Ralf


  [1]
    The Scrap Your Boilerplate approach
    http://www.cs.vu.nl/boilerplate/
    (Readily supported by the GHC compiler for Haskell, no external machinery,
      BTW this is a spinoff of Strafunski!)


[2]
    The Generic Haskell project
    http://www.generic-haskell.org/
    (This is a proper and susbtantial language extension of Haskell)


[3]
    Polytypic programming extensions at Chalmers
      http://www.cs.chalmers.se/~patrikj/poly/
    (There is also a recent one for Haskell indeed.)


[4]
@PhdThesis{HaginoThesis,


    author = "Tatsuya Hagino <http://liinwww.ira.uka.de/mpsbib/index?query=HaginoT&partial=on&case=on&results=&maxnum=200>",
    title = "*A Category Theoretic Approach to Data Types*",
    school = "University of Edinburgh, Department of Computer
                                  Science",
    checked = "Yes",
    year = "1987",
    note = "CST-47-87 (also published as ECS-LFCS-87-38)",
}




[5]
@Book{Manes86,


    author = "E. G. Manes <http://liinwww.ira.uka.de/mpsbib/index?query=ManesEG&partial=on&case=on&results=&maxnum=200> and M. A. Arbib <http://liinwww.ira.uka.de/mpsbib/index?query=ArbibMA&partial=on&case=on&results=&maxnum=200>",
    title = "*Algebraic Approaches to Program Semantics*",
    series = "Texts and Monographs in Computer Science",
    publisher = "Springer-Verlag",
    address = "Berlin",
    year = "1986",
    keywords = "parallel architectures, independent,",
}




[6] *
*Designing and Implementing Combinator Languages (1999)
http://www.cs.uu.nl/~doaitse/Papers/1999/AFP3.pdf
E.g. Page 171




[7]
Bananas in Space: Extending Fold and Unfold to Exponential Types (1995)
http://citeseer.ist.psu.edu/293490.html


[8]
@INPROCEEDINGS{Koskimies91,
  AUTHOR = "K.~Koskimies",
  TITLE = "{O}bject {O}rientation in {A}ttribute {G}rammars",
  PAGES = "297--329",
  CROSSREF = "SAGA91"
}


@proceedings{SAGA91,
  EDITOR = "H.~Alblas and B.~Melichar",
  BOOKTITLE = "{Proc.\ International Summer School on Attribute grammars,
Applications and Systems (SAGA'91), Prague, Czechoslovakia}",
  YEAR = 1991,
  MONTH = Jun,
  SERIES = "LNCS",
  VOLUME = "545",
  PUBLISHER = "Springer-Verlag"
}


[9]
Grammar adaptation
http://homepages.cwi.nl/~ralf/fme01/


[10]
Grammarware engineerng -- the agenda
http://www.cs.vu.nl/grammarware/agenda/
More generally:
http://www.cs.vu.nl/grammarware/


[11]
@article{DCMS03,
    author = "T.R.~Dean and J.R.~Cordy and A.J.~Malton and K.A.~Schneider",
    title = "{Agile Parsing in TXL}",
    journal = "Journal of Automated Software Engineering",
    volume = 10,
    number = 4,
    month = oct,
    year = 2003,
    pages = "311--336"
}


[12]
  From a draft that I would like to work on: (i) we would prefer an
implementation of Haskell's type classes, which can be instructed to
exhibit less detailed types but rather type *classes* when looking at
inferred or reported types; (ii) some optimisations such as
memoisiation or delayed dictionary construction would make sure that
the conceptually giant class dictionaries do not hamper
scalability. (This is actually slighly less of a problem than in
general type-class programming, where one writes more arbitrary
programs at the type level. That coding style would benefit from even
more advanced improvements of present Haskell implementations.)


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.