Re: Infinite look ahead required by C++?

"Ira Baxter" <idbaxter@semdesigns.com>
Sat, 13 Feb 2010 10:59:33 -0600

          From comp.compilers


From: "Ira Baxter" <idbaxter@semdesigns.com>
Newsgroups: comp.compilers
Date: Sat, 13 Feb 2010 10:59:33 -0600
Organization: Compilers Central
References: 10-02-024 10-02-029 10-02-047
Keywords: C++, parse
Posted-Date: 13 Feb 2010 12:02:36 EST

"Chris F Clark" <cfc@shell01.TheWorld.com> wrote in message
> "Ira Baxter" <idbaxter@semdesigns.com> writes:
>
>> I really don't like the phrase "hard to parse", because it's relative
>> to your parsing technology. GLR parsers are capable of parsing C++
>> easily in spite of the ambiguous grammar.
>
> I think relative to C++ this is a dead issue. The problems (and
> "solutions") to C++ are well-known. I think the number of people who
> are making C++ a subset of their language is quickly approaching 0.


Agreed. But there are a lot of people who want to build tools
to process the languages they use daily (C, C++, COBOL, VB,
...), and there's a bull-headed insistence on trying to parse them using
weak parsing technologies. My point is that the parsing technology
exists to handle pretty much all of these. "The right tool
for the right job."


> However, for language design (and compiler writing) I think there is
> still a fundamental issue, and "hard to parse" while a bit misguided
> does actually get at something important. The issue is not whether we
> have a parsing technology that solves the problem. The issue is
> whether we are designing an incomprehensible mess. An ambiguous
> grammar is not an iron-clad sign that one has produced one--there are
> ambiguous grammars that describe Pascal, which is syntactically quite
> simple. However, I would take it as a strong indicator. If one needs
> to invent (or use) a new parsing technology (GLR, PEGs, Boolean
> Grammars, etc.) to describe one's language, then there are going to be
> tools that don't have said technology and who will "hack" the result
> in.


Well, we don't have to invent these technologies. That's done.
They're even widely available. Bison offers GLR, for example.
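
To make "ambiguous grammar" concrete, here is a minimal sketch in Python.
It is a brute-force enumerator, not GLR and not anything Bison does
internally; the function name all_parses and the toy grammar
E -> E op E | id are assumptions chosen for illustration. It only shows
what a parser that tolerates ambiguity has to hand back: every tree the
grammar allows, rather than a single one.

```python
from functools import lru_cache


def all_parses(tokens):
    """Return every parse tree of `tokens` under the ambiguous toy grammar
    E -> E op E | id, where op is '+' or '*' (hypothetical example grammar)."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def parse(lo, hi):
        # All trees that derive tokens[lo:hi] from E.
        if hi - lo == 1:
            return (tokens[lo],) if tokens[lo] == "id" else ()
        trees = []
        # Try every interior operator as the root of the tree.
        for mid in range(lo + 1, hi - 1):
            if tokens[mid] in ("+", "*"):
                for left in parse(lo, mid):
                    for right in parse(mid + 1, hi):
                        trees.append((tokens[mid], left, right))
        return tuple(trees)

    return list(parse(0, len(tokens)))


if __name__ == "__main__":
    # The grammar has no precedence rules, so both groupings are returned.
    for tree in all_parses(["id", "+", "id", "*", "id"]):
        print(tree)
```

A real GLR parser reaches the same set of trees by forking an LR parse
at each conflict and sharing common subtrees, which keeps the cost
manageable on realistic inputs; the enumeration above makes no attempt
at that.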


> This issue occurs not just at the parsing level. Templates in C++ and
> type-inference in FP languages seem to exhibit the same issue at the
> semantic level. If you write valid code in them, life is good.
> However, if you make an error, the resulting messages are often so
> cryptic and useless that you are often better off, just looking for
> your typo rather than trying to decipher what the message is
> attempting to tell you.


Perhaps we need more serious effort put into automating the production
of good error reports. While there is some research there,
none of it appears to me to be in widespread use.


> And, to me that reflects back to the ambiguous grammar issue. If you
> have a language that has an ambiguous grammar which requires a
> sophisticated parsing technology to resolve, and some poor neophyte
> user makes an error, what are the odds that the user will be presented
> with a cryptic error message that doesn't make it clear that their
> issue is that they forgot some punctuation n-tokens back and that we
> have therefore encountered an error here using a bizarre and
> unintended interpretation of the tokens? The fact that this sequence
> is not a valid declarator-sans-implicit-filler-row-clause is
> irrelevant to the user, since they weren't trying to write one of
> those.


Perhaps. I'm conflicted by your use of "neophyte" and "type inference
in FP languages" in the same discussion. I think we have to be clear
where the problem lies: is it in the neophyte's lack or shallowness of
understanding, or the crypticness of the tool even for the expert?
There's a component of both here. C++ is a very good weapon in an
expert's hands. And it makes neophytes flee to the perceived
simplicity of PHP.


> The fact that GLR can successfully resolve an ambiguous input fed to
> an ambiguous grammar and produce the resulting parse forest, which can
> then be pruned by semantics (e.g. unification) may make writing
> compilers easier. However, it isn't clear that this will promote
> better language design.


> Perhaps, the ambiguity reports you listed later in your posting would
> help. I don't know. I haven't tried using them. However, from what
> you showed, it worries me that they may be treated the same way lint
> outputs were used in C--they were used when the number of lint
> warnings were small, but too often the number of noise errors on
> things the person "knew" were ok overwhelmed the signal and people
> stopped caring.


Point to you. We don't care; we don't run that report very often,
because our tools are able to handle the ambiguities with relative
grace. (Of course, we pay a heavy price in having to encode all the
rules that resolve the ambiguities, and there I'm just as unhappy [or
more so, having had to actually do it] as you.)
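
As an illustration of what one such resolution rule can look like, here
is a minimal Python sketch built on the classic C++ case: the statement
"a * b;" declares b as a pointer if a names a type, and is a
multiplication otherwise. The functions interpretations and resolve and
the tree shapes are hypothetical, invented for this example, and do not
describe any particular tool.

```python
def interpretations(tokens):
    """Return both readings a C++-style grammar allows for `a * b ;`."""
    a, star, b, semi = tokens
    if (star, semi) != ("*", ";"):
        raise ValueError("this sketch only handles the shape: a * b ;")
    return [
        ("declaration", {"type": a, "declarator": ("pointer", b)}),
        ("expression", ("multiply", a, b)),
    ]


def resolve(candidates, type_names):
    """Keep only the readings consistent with the known type names."""
    kept = []
    for kind, tree in candidates:
        if kind == "declaration" and tree["type"] in type_names:
            kept.append((kind, tree))
        elif kind == "expression" and tree[1] not in type_names:
            kept.append((kind, tree))
    return kept


if __name__ == "__main__":
    candidates = interpretations(["T", "*", "x", ";"])
    print(resolve(candidates, type_names={"T"}))   # T is a type
    print(resolve(candidates, type_names=set()))   # T is a variable
```

The parser stays ignorant of the symbol table and simply reports both
readings; the pruning happens afterwards. Multiply this one rule by
every ambiguity in the language and the "heavy price" is easy to see.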


Regardless, we're not going to change those languages; that's up to
programming language committees (hey, Java 10 committee, are you
listening?) or the recent rash of scripting languages from single
point sources (PHP, Ruby, ...).


I'd prefer simpler languages, too. But I'm not sure I want to constrain
their shape to match the simplest parsing technology I can find.


-- IDB


