Re: Compiler with adjustable parsers

webber@athos.rutgers.edu (Bob Webber)
7 Mar 90 08:57:06 GMT

          From comp.compilers

Related articles
[3 earlier articles]
Re: Compiler with adjustable parsers steve@hubcap.clemson.edu (1990-03-01)
Re: Compiler with adjustable parsers dhw@itivax.iti.org (1990-03-02)
Re: Compiler with adjustable parsers pgl@cup.portal.com (1990-03-15)
Re: Compiler with adjustable parsers ipser@vaxa.isi.edu (Ed Ipser) (1990-03-03)
Re: Compiler with adjustable parsers PIRINEN@CC.HELSINKI.FI (Pekka P. Pirinen) (1990-03-03)
Re: Compiler with adjustable parsers fineman@ptolemy.arc.nasa.gov (Charles Fineman) (1990-03-05)
Re: Compiler with adjustable parsers webber@athos.rutgers.edu (1990-03-07)
Re: Compiler with adjustable parsers aarons@cogs.sussex.ac.uk (Aaron Sloman) (1990-03-08)
| List of all articles for this month |

From: webber@athos.rutgers.edu (Bob Webber)
Newsgroups: comp.compilers
Keywords: parse
Date: 7 Mar 90 08:57:06 GMT
References: <1990Mar1.143905.14553@esegue.segue.boston.ma.us> <8200@hubcap.clemson.edu>
Organization: Rutgers Univ., New Brunswick, N.J.

In article <8200@hubcap.clemson.edu>, steve@hubcap.clemson.edu ("Steve" Stevenson) writes:
> hackeron@ATHENA.MIT.EDU writes:
>
> >Does anyone know of a compiler/language that allows you to specify changes
> >to how the language is parsed (in part at least) from withing the program.
>
> Way back in the Jurassic age, there was a lot of work by Cheatham and
> Wegbreit in ``extensible languages.'' Sorry, no direct references come to
> mind, although I seem to recall a CACM article by Wegbreit on ``EL1''


Notation is an interesting problem. What does it mean to be a high level
language? You start off with something like assembly. You add a few macros
for commonly used constructs. You end up short on two counts: 1) compile-time
syntax checking -- balanced parenthesis, nested loops, etc and 2) optimization
-- register allocation, instruction choice, ... . You make a spiffy macro
processor and compile-time syntax checking is handled. But you are still left
with code quality problems. So you start with a simple optimizer that uses
language specific information to guide the optimizer. Optimizer technology
improves. You end up with optimizers that no longer pay much attention to
language knowledge and do global analysis again. Perhaps we can now go back
to assemblers with fancy macro packages using the optimizer as a
postprocessing pass.


Macros can get you a certain distance in regular languages. Using m4 with C
gives one enough rope to hang oneself if one is really interested. SAIL's
macro package was rather aggressive along these lines too. An algolized-lisp
interface called mlisp used operator-precedence grammars and let the user
introduce new operators with new precedence levels which handles the syntax
nicely (although operator-precedence grammars seem to have gone out of style).
Of course, it isn't really necessary to put this in the language. One could
always write awk scripts that generated C code. Or use yacc and lex and
generate a new notation like C++.


The tools have been around for a long time for a programmer to establish any
programming notation they want. But they tend to go unused. Two reasons: 1)
takes a lot of thought to analyze a problem to establish a useful notation for
programming it and 2) the chicken-and-egg problem -- people don't learn
programming this way, so they don't do it.


Of course there are two other issues to consider: portability and
maintainability. Assuming the notation the programmer creates is problem
specific rather than machine specific, this shouldn't be much of a problem,
particularly if one uses something like C as the target semantics (or more
specifically, porting would be no worse than porting C code usually is).
Maintainability means you have to convince others that your notation doesn't
just get in their way, but actually makes the problem easier to understand.
Bad notations are like bad designs -- they are hard to maintain and probably
shouldn't be maintained.


--- BOB (webber@athos.rutgers.edu ; rutgers!athos.rutgers.edu!webber)


p.s., Incidently, the EL1 information was written up in a book from which I
made the following notes:


Studies in Extensible Programming Languages
Ben Wegreit
Garland Publishing, Inc.
New York & London, 1980
ISBN 0-8240-4423-1


                                  . . . a good notation has a subtlety and
                                  suggestiveness which at times make it
                                  seem almost like a live teacher . . .
                                  a perfect notation would be a substitute
                                  for thought.
                                                                                    Bertrand Russell
                                          in the introduction to Wittgenstein's
                                                        Tractatus Logico-Philosophicus




    ``nested `continua' of languages. In such systems, new languages may
        be embedded, appended, extracted at will.''


      J. Smith, ``Syntatic and Semantic Augments to ALGOL,'' CACM 3, 4
      (April 1960), 211-213.


      ``What is needed is an extremely powerful and generalized language,
          but stripped down to the utmost, stripped down to the possibilities
          that it can, in itself, by a sort of procedure declaration, declare
          the rest of the mechanism needed.''


      W. L. van der Poel, Speaking during a panel discussion: ``Reflections
      from Processor Implementors on the Design of Languages,'' in Symbolic
      Languages in Data Processing (Proceedings of the Symposium of the
      International Computation Center, Rome, 1962), Gordon and Breach Science
      Publishers, New York, 1962.


      ``It is absolutely essential that this new language should have the
          facilities of using new syntactic forms.''


      C. Strachey, Speaking during a panel discussion: ``Is a Unification
      ALGOL-COBOL ALGOL-FORTRAN Possible? The Question of One or Several
      Languages,'' in Symbolic Languages in Data Processing (Proceedings of
      the Symposium of the International Computation Center, Rome, 1962),
      Gordon and Breach Science Publishers, New York, 1962.


      as an example, the following piece of notation was given:
            operator op priority + < op < x means ...
      for defining an operator op with a priority between + and x.


      also, look up:
      B. A. Galler and A. J. Perlis, ``A Proposal for Definitions in ALGOL,''
      CACM 10, 4 (April 1967), 204-219.


      Pages 19 thru 120 deal with properties of extensible context-free
      grammars. A parsing algorithm based on a modification of Earley's
      algorithm is given. Essentially, at any stage during the parse,
      the set of productions is the initial set plus any productions that
      can be extracted by a given finite state transducer from the portion
      of the sentential form that is to the left of the leftmost nonterminal
      symbol. There is a nesting scope to the applicability of a particular
      ``new'' rule and notation for removing rules from the set usable at
      a particular point in the parse as well as for adding them in.





Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.