Re: Languages that are hard to parse

Hans-Peter Diettrich <DrDiettrich@compuserve.de>
18 May 2005 00:49:24 -0400

          From comp.compilers


From: Hans-Peter Diettrich <DrDiettrich@compuserve.de>
Newsgroups: comp.compilers
Date: 18 May 2005 00:49:24 -0400
Organization: Compilers Central
References: 05-05-119 05-05-125 05-05-147
Keywords: parse, design
Posted-Date: 18 May 2005 00:49:24 EDT

Steve wrote:


> Are you saying that designing hard to parse languages is a good thing?


*How* are languages designed?


IMO new programming languages are designed for easy parsing, either
with a grammar-based parser generator or in handwritten code without
a (proven) grammar. Derivatives of "informal" legacy languages, like
C, fall into their own category (homebrew trouble, dead end ;-)


So the question is not about "hard to parse"; instead it should read
"formal design - yes or no".


In that respect it's a good idea to give parser generators the best
possible support for grammar analysis, so that, in the ideal case,
nobody will ever again try to write a handcrafted parser for his
language.


AFAIK even a good grammar analyzer cannot decide in general whether a
grammar is ambiguous - at least not for the more powerful grammar
classes. This would mean restricting future parser generators to
grammars that can be formally verified. Parser generation itself is
then less important, as long as the user knows that a big or slow
parser is the result of his own misdesign, and does not indicate a
poor parser generator whose output deserves manual optimization.
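
To illustrate the limitation: since ambiguity cannot be decided in
general, a tool can at best search for ambiguous sentences up to some
length bound. The Python sketch below does that by counting parse
trees. The two toy grammars and all names (make_counter,
ambiguous_strings) are made up for this illustration, and the code
assumes a grammar without empty productions and without cycles of unit
productions.

```python
from functools import lru_cache
from itertools import product

# Hypothetical toy grammar: E -> E "+" E | "a"
AMBIGUOUS = {"E": [("E", "+", "E"), ("a",)]}
# Left-recursive rewrite: E -> E "+" T | T ; T -> "a"
UNAMBIGUOUS = {"E": [("E", "+", "T"), ("T",)], "T": [("a",)]}


def make_counter(grammar):
    """Return count(symbol, text): number of parse trees of text from symbol.

    Assumes no empty productions and no unit-production cycles,
    otherwise the recursion would not terminate.
    """

    @lru_cache(maxsize=None)
    def count_seq(symbols, text):
        # Number of ways the symbol sequence derives exactly `text`.
        if not symbols:
            return 1 if text == "" else 0
        if len(symbols) > len(text):
            return 0  # every symbol must consume at least one character
        head, rest = symbols[0], symbols[1:]
        total = 0
        # head takes text[:i]; leave at least one character per remaining symbol
        for i in range(1, len(text) - len(rest) + 1):
            left = count(head, text[:i])
            if left:
                total += left * count_seq(rest, text[i:])
        return total

    @lru_cache(maxsize=None)
    def count(symbol, text):
        if symbol not in grammar:  # terminal symbol
            return 1 if symbol == text else 0
        return sum(count_seq(rhs, text) for rhs in grammar[symbol])

    return count


def ambiguous_strings(grammar, start, alphabet, max_len):
    """Yield (text, tree_count) for every text up to max_len with > 1 parse tree."""
    count = make_counter(grammar)
    for n in range(1, max_len + 1):
        for chars in product(alphabet, repeat=n):
            text = "".join(chars)
            trees = count(start, text)
            if trees > 1:
                yield text, trees
```

A hit proves the grammar ambiguous; an empty result only says that no
ambiguous sentence exists up to the chosen bound, which is exactly why
such a search is no substitute for a grammar class that can be
verified formally.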


For derivatives of C, or other legacy languages, I'd suggest a source
code converter that can translate all legacy code into a simpler and
more formal language, and restarting language evolution from that base.


In that respect I see no need or reason for more sophisticated parser
generators. But, as Quinn Tyler Jackson explained to me, there is more
out there than programming languages. The existence of natural
languages should be obvious; less obvious are other natural phenomena
of a possibly "formalizable" nature. This is a vast area for parsing
techniques and translator generators, whether new or already under
development. Appropriate tools should make it possible to express a
"feeling for some underlying order" in a formal way, and then to
verify the language design by feeding a test suite into the generated
parser. In such cases, again, the question is not about "hard to
parse" languages; it's only a question of what's feasible at all.
Context-sensitive or possibly irregular languages deserve new
descriptive formalisms that are not covered by traditional (BNF...)
grammars and tools.
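
As a small illustration of checking a language design against a test
suite, here is a Python sketch for { a^n b^n c^n }, the textbook
example of a language that no context-free (BNF) grammar can describe.
The recognizer, the sample inputs and the names accepts_anbncn,
TEST_SUITE and run_suite are assumptions made up for this example, not
the output of any particular tool.

```python
import re


def accepts_anbncn(text):
    """Recognize { a^n b^n c^n : n >= 1 }, a non-context-free language."""
    match = re.fullmatch(r"(a+)(b+)(c+)", text)
    if match is None:
        return False
    a_run, b_run, c_run = match.groups()
    # The equal-length constraint is the part plain BNF cannot express.
    return len(a_run) == len(b_run) == len(c_run)


# Hypothetical test suite: (input, expected verdict)
TEST_SUITE = [
    ("abc", True),
    ("aabbcc", True),
    ("aabbc", False),
    ("abcabc", False),
    ("", False),
]


def run_suite(recognizer, suite):
    """Return the inputs on which the recognizer disagrees with the suite."""
    return [text for text, expected in suite if recognizer(text) != expected]
```

An empty list from run_suite(accepts_anbncn, TEST_SUITE) means the
recognizer agrees with every sample; any returned input points at a
spot where either the formal description or the "feeling" behind the
test suite needs another look.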


DoDi

