Re: Definable operators

Craig Burley <burley@tweedledumb.cygnus.com>
8 May 1997 21:37:36 -0400

From comp.compilers

From:	Craig Burley <burley@tweedledumb.cygnus.com>
Newsgroups:	comp.compilers
Date:	8 May 1997 21:37:36 -0400
Organization:	Cygnus Support
References:	97-03-037 97-03-076 97-03-112 97-03-115 97-03-141 97-03-162 97-03-184 97-04-027 97-04-095 97-04-113 97-04-130 97-04-164 97-05-053
Keywords:	syntax, design

Dave Lloyd <Dave@occl-cam.demon.co.uk> writes:

> Craig Burley (burley@tweedledumb.cygnus.com) wrote:
> {much omitted}
> As to your thesis against overloading, I believe you to be deeply
> mistaken. Your use of C and Fortran 77 as defenses is particularly
> risible: both suffer from the awful problem that I can override SIN
> etc., at link-time. Any abstraction can be used to hide both
> irrelevant implementation and significant semantics. It is the
> craftsman's job to get the balance right as it is with English.
> Elegance is the goal not some kind of entropic minimalism. A good and
> powerful tool can help you create works that you would otherwise be
> incapable of, but it can also take you straight to Hospital. These
> things cannot be legislated, they can barely be taught, but they can
> be learnt and are often but matters of personal or cultural taste.
>
> But I can see that we can never agree on this one.

Exactly _where_ do we disagree? It's hard for me to understand how I
can be "deeply mistaken" because Fortran and C allow overriding of SIN
(though not at link time in decent Fortran systems -- I'm unaware of
any _indecent_ systems, btw). Are you saying that, because Fortran
allows overriding of SIN, I must therefore celebrate any and all
instances of operator overloading, regardless of how far afield from
the operator's original purpose and meaning such instances take it?

If you'd been reading my earlier posts (which I admit could take a
while), you'd have seen that I _did_ attend to such "awful problems".
One problem I pointed out, a case of _relative_ lowering of linguistic
usefulness, was Fortran 90's CONTAINS, which changed the way one
determines the meaning of

R = SIN(S)

which, in FORTRAN 77, was "invokes SIN intrinsic iff no _preceding_
statement in the program unit declares, explicitly or implicitly,
SIN as naming something other than the intrinsic".

In Fortran 90, the determination requires more work, so the language
is weaker in this one area -- "invokes SIN intrinsic iff no _preceding
or following_ statement in the program unit declares...".

That is, in FORTRAN 77, you needed only "look upwards" in the code to
determine what R = SIN(S) meant. In Fortran 90, you _also_ must "look
downwards" in case "CONTAINS" followed by "FUNCTION SIN" occurs.
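
The same "look downwards" effect can be sketched in C++. This is an
analogy only, not the Fortran case above: inside a class, a member
function body sees members declared later in the class, so an
unqualified call to sin need not reach the math library. The names
Wave, eval, and eval_free are hypothetical, chosen for illustration.

```cpp
#include <cmath>

// Hypothetical type, for illustration only.
struct Wave {
    // Reads like a call to the math library's sin...
    double eval(double s) const { return sin(s); }

    // ...but the call above resolves to this later declaration,
    // because member function bodies see the whole class.
    double sin(double) const { return 42.0; }
};

// Outside the class, the qualified name reaches the library function.
double eval_free(double s) { return std::sin(s); }
```

A reader who stops at eval and only looks upwards would guess the
wrong meaning for sin(s); the declaration that decides it comes later.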

My point has never been that F77 is "perfect"; rather, by illustrating
the specific way in which F90 made Fortran worse _linguistically
speaking_ in this one area, I have tried to illustrate the more
general principles that could just as easily be used to explain things
I think we have all known to be "bad juju" for a long time. (Otherwise,
why do so many people claim Fortran is a "bad language"?)

I didn't get into all sorts of other examples of linguistic problems
that various languages (such as C and Fortran) have, mostly because
there's no need to -- the context of the discussion is overloading.
I've been addressing (and attempting to strongly refute) the notion
that totally flexible overloading as a facility is _always_ a "good
thing".

I don't believe I've _ever_ said that overloading is always a _bad_
thing, or that it must necessarily be restricted by language
processors (such as compilers), or even that any language that offers
overloading must necessarily mandate _particular_ restrictions on its
use. I have said such mandated restrictions tend to _increase_ the
semantic content of expressions in that language, and any such
increase is, taken in isolation, a good thing for the language (though
of course the whole picture must be taken into account).

If you doubt this, decide whether it would be better if C++ were
"improved" to be even more "flexible" in allowing
overloading/augmenting such that "a = b + c;" would also mean "and,
whenever this statement is executed, increment the value of whatever
variable appears on the _next_ line in the source code". Does that
sound like a useful _improvement_ in the language? Yet, as a
programmer, I would find such a general facility quite useful,
e.g. overloading the preprocessing activity so that preprocess-time
and parse-time optimizations could be done to, for example, reduce the
number of heap allocations, eliminate unused variables, add
profiling/debugging information at a higher level than existing tools
usually provide, etc.

I have no doubt such an improvement would make C++ an "even better"
tool. I also have no doubt it would make it an "even worse" language.

(As an aside -- I find it _quite_ fascinating that, in these
"disagreements" over my statements about language, every single
"disagreer" has completely missed one or two of what I believe have
been fairly clear explanations on my part. If the few of us who care
about language design and read comp.compilers can't even comprehend
well-written prose in English, why do we think we can design computer
languages that will be reliably and effectively used by millions of
computer programmers around the world?? For my own part, I've
actually made _serious_ investments in improving my communications
skills so I could learn something about what it takes to communicate
effectively, and I believe I do a pretty good job of it. Yet, I don't
believe I'm good enough to design a new language "from scratch" that
gets everything, or even enough, "right". I do know enough to be
quite sure when I see _bad_ language design done, i.e. I have enough
to know when to make effective use of whatever "veto power" I have,
and C++ is an excellent example of making an existing language (C)
much worse -- even as it makes an existing tool (cc) much more
powerful. Fortran 90 is less of an example over FORTRAN 77, so I tend
to pick on it less here, but I've picked on it often enough on
comp.lang.fortran.)

> Rather than spend effort in restraining the programmer in the language
> I would rather invest some effort in defining a lexicon of terminology
> for user-defined operators, functions, etc. Whether this be formalised
> by an international standard or maintained as a dialect within a
> project is perhaps less important but clearly as with English the
> wider some vocabulary is understood, the greater its
> applicability. Many software houses already maintain Coding Standards
> including naming conventions and if peer code reviews are regularly
> performed humans can enforce these standards with appropriate
> flexibility.
>
> Incidentally such a lexicon might reduce the anglocentric nature of
> programming you espouse. Terms in the lexicon could easily be given
> their correspondents in other natural languages providing a leg up for
> automatic translation of (say) French software into Norse software (ok
> it wouldn't be perfect but surely a great improvement).

I'm in favor of much of that, but until we all agree that a language
that allows

a = b + c /* whether a semicolon follows or not ;-) */

to be redefined such that b or c can be modified, a can be referenced
before it's modified, and so on, is thus _worse_ as a _language_, and
that any code that actually performs such redefinitions is "wrong on
the face of it", we won't have achieved much. The people who claim
the above _could_ mean "concatenate the strings in b and c and store
the result in a" make anything we do with such a lexicon useless --
because they feel they should be perfectly able to redefine (and be
congratulated for redefining) existing, widely understood symbols and
names to mean new things simply for their convenience. Thus, they
don't even bother to ask for, much less demand, a language that gives
them a _natural_
way to express what they want (concatenation), since they're already
patting themselves on the back for figuring out all the intricacies of
overloading enough to accomplish what they want using Someone Else's
Notation.
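
A minimal sketch of the kind of redefinition meant here, assuming a
hypothetical type Tally and a hypothetical named function add (neither
comes from any real library). C++ accepts an operator+ that takes its
left operand by non-const reference and modifies it:

```cpp
// Hypothetical type, for illustration only.
struct Tally {
    int value;
};

// Legal C++: "b + c" silently changes b before computing the result.
Tally operator+(Tally& lhs, const Tally& rhs) {
    lhs.value += 1;                        // side effect on the left operand
    return Tally{lhs.value + rhs.value};
}

// A named function with const operands cannot modify either one,
// and the call site does not borrow the meaning of "+".
Tally add(const Tally& lhs, const Tally& rhs) {
    return Tally{lhs.value + rhs.value};
}
```

With b.value == 1 and c.value == 2, the statement "Tally a = b + c;"
leaves b.value == 2 and a.value == 4, which nobody reading that one
line would expect.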

This gets to the crux of what I'm concerned about, but haven't really
said yet.

Essentially, we _believe_ (or we claim) we can make good languages by
precisely defining rules that, however arcane, "good programmers" will
follow to the letter, get exactly right all the time, or at least be
assured that any error that is a product of the normal (for humans)
error rate will be caught at compile or link time.

Then, we get into huge discussions here, on comp.arch, in
comp.lang.fortran, or in comp.lang.c because we find more and more
code that relies on "hidden", but often logical, assumptions made by
the programmer -- often as a sort of mnemonic device to substitute for
rote memorization of all the arcane rules we've set up for him.

(For example, assuming "int i;" on a 32-bit machine declares a 32-bit
two's-complement value insofar as it will have the expected "wrapping
behavior", and writing code to depend on that.)

The response of some of us is to yell at the programmer who wrote the
code. In effect, we yell "you shouldn't depend on that", and in
response to the comeback "but I needed that feature, and that code
worked for me", we yell "if you _need_ a behavior not provided
explicitly by the language constructs you are using (regardless of
whether they happen to work on one machine/compiler/OS combination),
you must _explicitly_ specify that behavior!!".

So far, so good, or at least some of us seem to think that way, even
when the programmer yells back "but you gave me _no way_ to explicitly
specify that I needed that behavior, such as `int32 i;'!!".

Here is where I have a _big_ problem with things like operator
overloading and other "kitchen-sink" features good programmers (_not_
good language designers) create to give people escape hatches for
their languages, ...

...because the response some of us provide is often "well, if you want
_that_, you create a new class, y'see, and you overload operators,
etc.".

In other words, _because_ we have the crutch of arbitrarily
overloadable infix operators, we tell people to get it out and use it
anytime they need to express something the _language_ doesn't let them
express.

In fact, what we're doing is telling them to spend lots of time,
effort, and bug-creating energy using a facility that is a poor choice
for many such uses, just because it happens to serve as a flexible
tool for implementation of a particular solution to their problem,
even while it serves as a very poor means to express their problem in
the first place.

Instead of working so hard to make operator overloading meet _every
possible_ need out there, we should be _thinking_ hard about how to
design languages where it is _easy, natural, and robust_ for
programmers to express common things like "I want an exactly-32-bit
integer".
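
Later revisions of C and C++ did add such a spelling: the <cstdint>
header (C++11; <stdint.h> in C99) provides std::int32_t on
implementations that have an exactly-32-bit two's-complement type.
The sketch below also states the wrapping requirement explicitly
instead of relying on signed overflow, which is undefined behavior for
plain int. The helper name wrapping_add is hypothetical.

```cpp
#include <cstdint>
#include <limits>

// Exactly 32 bits by declaration, not by assumption about the machine.
static_assert(std::numeric_limits<std::int32_t>::digits == 31,
              "31 value bits plus a sign bit");

// Hypothetical helper: adds two 32-bit values with wraparound.
// The sum is computed in unsigned arithmetic, where wrapping modulo
// 2^32 is defined, then mapped back to the signed range by hand so
// that no out-of-range conversion is needed.
std::int32_t wrapping_add(std::int32_t a, std::int32_t b) {
    const std::uint32_t sum =
        static_cast<std::uint32_t>(a) + static_cast<std::uint32_t>(b);
    const std::uint32_t half = 0x80000000u;
    if (sum < half) {
        return static_cast<std::int32_t>(sum);
    }
    return static_cast<std::int32_t>(sum - half) +
           std::numeric_limits<std::int32_t>::min();
}
```

Here the width lives in the type name and the wrapping lives in the
function name, so neither depends on what a particular
machine/compiler/OS combination happens to do.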

But the crutches we give people are just "too sexy". When they ask
for means of _expression_ (and sometimes it is hard to know that's
what they're asking for), we offer them _implementations_ and the
ad-hoc notation we've invented to call upon them, ignoring the fact
that those notations are _not_ solutions to the problems of
_expression_.

And one of the worst things we do in this area is sacrifice
linguistic usability (e.g. we reduce semantic content of expressions)
simply to avoid a little typing! That is, we decide we don't want to
actually type out a clearer, more detailed, immediately comprehensible
expression that says what we want, instead opting for some compact,
ad-hoc notation that we think "means the same thing", when in fact all
it really does is "implement the thing in a compatible way".

And we keep encouraging language designers to make it easier for us to
do this kind of thing using their languages.

That is Wrong. ;-)

Yet, we keep telling ourselves that the "unwashed masses" of
programmers will have clever editors and other tools that will help
them discover what the code we write "is really saying", even though
the reality is they often don't, and even if they did, those tools
won't help them read a program listing on a jet about to take off or
land.

The reality is, really good programmers who want to minimize typing
and use their own personal ad-hoc notations to represent expressions
that otherwise would be "lengthy" use their _own_ tools to make those
things happen, leaving the "lengthier", clearer expressions in the
program source _as seen by other people_.

It seems we have very few such programmers, however. Lots of people
who _think_ they're really good (and justify it based, partly, on how
good they are at memorizing and navigating unnecessary, arcane
language rules, like C's precedence table and C++ as a whole) are
simply shifting the burden of comprehension onto everyone else -- and
are probably doing a worse job of solving the problems they're
actually called upon to solve.

Thankfully, there are some really good programmers who aren't so
interested in minimizing keystrokes, for themselves or others on their
project, and they are willing (and able) to impose "rules" on what can
and cannot be done in the language they're using (e.g. C++),
resulting in a per-project (or per-shop) language subset that, at
least, people well-acquainted with that subset can better deal with.

FORTRAN has needed such programmers (and effective language subsets)
for decades. That even "modern" languages like C++ need it shows how
poor they are as _languages_ (again, not as tools) by comparison.
That is, when you think in terms of "how much of this language should
I, and others, avoid if our goal is to write clear, maintainable
code", there seems to be an _increase_, not a decrease, in that
percentage for some modern languages over their predecessors.

Good languages would see a decrease in this percentage, in my opinion,
and one of the culprits that prevents this is an "everything-goes"
attitude towards operator overloading.

Until we get away from seeing technical stupid-pet-trick stuff like
operator overloading as our salvation, we won't focus (as an industry)
on solving the real problems, which involve human factors engineering,
linguistic ergonomics -- basically, giving people languages that let
them say what they _know_ about their programming problem, then
_separately_ specify, where necessary, how to translate those
expressions into solutions (such as implementations). Operator
overloading is one of several serious rat-holes we've gone down while
bumbling towards the goal.

--
James Craig Burley, Software Craftsperson burley@gnu.ai.mit.edu
