Re: Infinite look ahead required by C++?

Chris F Clark <cfc@shell01.TheWorld.com>
Wed, 10 Feb 2010 15:00:04 -0500

          From comp.compilers

From: Chris F Clark <cfc@shell01.TheWorld.com>
Newsgroups: comp.compilers
Date: Wed, 10 Feb 2010 15:00:04 -0500
Organization: The World Public Access UNIX, Brookline, MA
References: 10-02-024 10-02-029
Keywords: C++, design
Posted-Date: 13 Feb 2010 11:27:17 EST

"Ira Baxter" <idbaxter@semdesigns.com> writes:


> I really don't like the phrase "hard to parse", because it's relative
> to your parsing technology. GLR parsers are capable of parsing C++
> easily in spite of the ambiguous grammar.


I think that, relative to C++, this is a dead issue. The problems of
parsing C++ (and their "solutions") are well known. I think the number
of people who are making C++ a subset of their language is quickly
approaching 0.


However, for language design (and compiler writing) I think there is
still a fundamental issue, and "hard to parse", while a bit misguided,
does actually get at something important. The issue is not whether we
have a parsing technology that solves the problem. The issue is
whether we are designing an incomprehensible mess. An ambiguous
grammar is not an iron-clad sign that one has produced one--there are
ambiguous grammars that describe Pascal, which is syntactically quite
simple. However, I would take it as a strong indicator. If one needs
to invent (or use) a new parsing technology (GLR, PEGs, Boolean
Grammars, etc.) to describe one's language, then there are going to be
tools that lack said technology, and their authors will "hack" the
result in.


This issue occurs not just at the parsing level. Templates in C++ and
type-inference in FP languages seem to exhibit the same issue at the
semantic level. If you write valid code in them, life is good.
However, if you make an error, the resulting messages are often so
cryptic and useless that you are better off just looking for your
typo rather than trying to decipher what the message is attempting
to tell you.


And, to me, that reflects back to the ambiguous grammar issue. Suppose
you have a language with an ambiguous grammar that requires a
sophisticated parsing technology to resolve, and some poor neophyte
user makes an error. What are the odds that the user will be presented
with a cryptic error message, one that doesn't make it clear that they
forgot some punctuation n tokens back and that the parser has therefore
reported an error here under a bizarre and unintended interpretation of
the tokens? The fact that this sequence is not a valid
declarator-sans-implicit-filler-row-clause is irrelevant to the user,
since they weren't trying to write one of those.


The fact that GLR can successfully resolve an ambiguous input fed to
an ambiguous grammar and produce the resulting parse forest, which can
then be pruned by semantics (e.g. unification), may make writing
compilers easier. However, it isn't clear that this will promote
better language design.
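
To make the "forest, then prune" idea concrete, here is a minimal
sketch in Python. It is not a GLR parser. It hard-codes the two
readings of the classic C-style statement "a * b ;" (a declaration of
a pointer b, or a multiplication) and then discards whichever reading
disagrees with a set of known type names. The names Decl, Mul,
candidate_parses and prune are hypothetical and chosen only for this
illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Decl:
    """Reading 1: `type_name * var_name ;` declares a pointer variable."""
    type_name: str
    var_name: str


@dataclass(frozen=True)
class Mul:
    """Reading 2: `left * right ;` is a multiplication expression."""
    left: str
    right: str


def candidate_parses(tokens):
    """Return every reading of `X * Y ;` that syntax alone allows."""
    if len(tokens) == 4 and tokens[1] == "*" and tokens[3] == ";":
        x, _, y, _ = tokens
        return [Decl(x, y), Mul(x, y)]
    raise SyntaxError(f"expected 'name * name ;', got {' '.join(tokens)!r}")


def prune(forest, type_names):
    """Keep only the readings consistent with the known type names."""
    kept = []
    for tree in forest:
        if isinstance(tree, Decl) and tree.type_name in type_names:
            kept.append(tree)
        elif isinstance(tree, Mul) and tree.left not in type_names:
            kept.append(tree)
    return kept


forest = candidate_parses(["a", "*", "b", ";"])
as_declaration = prune(forest, type_names={"a"})   # `a` is a type
as_expression = prune(forest, type_names=set())    # `a` is a variable
```

The same four tokens yield a different tree depending on information
that is nowhere near the statement itself, which is convenient for the
compiler writer but says nothing about whether a human reader can tell
the two cases apart.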


Perhaps the ambiguity reports you listed later in your posting would
help. I don't know. I haven't tried using them. However, from what
you showed, it worries me that they may be treated the same way lint
output was treated in C: people used it while the number of lint
warnings was small, but too often the noise warnings on things the
person "knew" were ok overwhelmed the signal and people stopped
caring.


While I hope I'm not a Luddite, I think there is much to be said for
simplicity. A person who wants to learn language design should
concentrate on languages which are clearly and simply LL(1), i.e.
languages which have as the first token of each "phrase" a
disambiguating token. Note that when you have that, good error
messages are generally much easier, as the disambiguating token tells
one what one is looking for. Each other part of the language should
strive for the same simplicity and clarity.
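
As a rough illustration of that point, the following Python sketch
parses a toy language in which every statement starts with a unique
keyword. The language ("var NAME = VALUE ;" and "print VALUE ;") and
all function names are invented for this example. Because the first
token selects the rule, every error message can name the construct the
user was evidently trying to write.

```python
def token_at(tokens, index, what, context):
    """Return tokens[index], or report what was expected in `context`."""
    if index >= len(tokens):
        raise SyntaxError(f"in {context}: expected {what}, found end of input")
    return tokens[index]


def expect(tokens, index, wanted, context):
    """Check that tokens[index] is exactly `wanted`."""
    found = tokens[index] if index < len(tokens) else "end of input"
    if found != wanted:
        raise SyntaxError(f"in {context}: expected {wanted!r}, found {found!r}")


def parse_var(tokens):
    # var NAME = VALUE ;
    context = "'var' declaration"
    name = token_at(tokens, 1, "a name", context)
    expect(tokens, 2, "=", context)
    value = token_at(tokens, 3, "a value", context)
    expect(tokens, 4, ";", context)
    return ("var", name, value)


def parse_print(tokens):
    # print VALUE ;
    context = "'print' statement"
    value = token_at(tokens, 1, "a value", context)
    expect(tokens, 2, ";", context)
    return ("print", value)


# The leading keyword alone decides which rule applies (the LL(1) property).
RULES = {"var": parse_var, "print": parse_print}


def parse_statement(tokens):
    """Parse one statement by dispatching on its first token."""
    if not tokens:
        raise SyntaxError("expected a statement, found end of input")
    rule = RULES.get(tokens[0])
    if rule is None:
        raise SyntaxError(
            f"expected one of {sorted(RULES)} at start of statement, "
            f"found {tokens[0]!r}"
        )
    return rule(tokens)
```

With this structure, a forgotten semicolon in a "var" statement is
reported as a problem in a 'var' declaration, not as a failure to match
some unrelated construct the user never intended.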


Just one person's opinion,
-Chris


******************************************************************************
Chris Clark email: christopher.f.clark@compiler-resources.com
Compiler Resources, Inc. Web Site: http://world.std.com/~compres
23 Bailey Rd voice: (508) 435-5016
Berlin, MA 01503 USA twitter: @intel_chris
------------------------------------------------------------------------------


