What attributes of a programming language simplify its implementation?

Christopher F Clark <christopher.f.clark@compiler-resources.com>
Fri, 30 Sep 2022 12:46:28 +0100

          From comp.compilers


From: Christopher F Clark <christopher.f.clark@compiler-resources.com>
Newsgroups: comp.compilers
Date: Fri, 30 Sep 2022 12:46:28 +0100
Organization: Compilers Central
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="57283"; mail-complaints-to="abuse@iecc.com"
Keywords: design
Posted-Date: 30 Sep 2022 21:42:18 EDT

I answered this question on Quora, but I think it is relevant to this
community (and I know I'll get discussion as a result).


What attributes of a programming language simplify its implementation?


      1. Simple semantics. That's it. Simple semantics. (Simple meaning
      whatever is easy to implement. Not mathematical elegance. Not
      consistency.)


How do you get there?


Have a very simple set of types. BASIC had numbers, strings, and arrays.
Don't worry about type conversions and floating point versus integer. Sweep
that all under the rug. Whatever your implementation does, that's what it
does. (Even simpler is what a lot of shells do: you have just "strings", and
if the strings happen to be numbers when you pass them to the "add
function", the + operator, it does arithmetic. If they aren't, whatever it
does is the definition.)
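As a minimal sketch of that "strings only" approach (the function name add
and the choice of concatenation as the fallback are assumptions made for
illustration, not taken from any particular shell):

```python
def add(a: str, b: str) -> str:
    """Shell-style '+': every value is a string, in and out."""
    try:
        # If both strings happen to look like integers, do arithmetic.
        return str(int(a) + int(b))
    except ValueError:
        # Otherwise, whatever happens here *is* the definition.
        # This sketch arbitrarily picks concatenation.
        return a + b
```

Note that nothing here was designed: which strings count as numbers is
simply whatever int() accepts, which is exactly the "whatever your
implementation does" attitude.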


Do an interpreter rather than a compiler. Don't try to get "efficient"
machine code. Just get code that works, for your simple cases. See the
paragraph above. Whatever your interpreter does, that's what it does.
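To give a feel for how little an interpreter needs, here is a rough sketch
of a tree-walking evaluator. The tuple-based tree format, the node names
("num", "var", "add", "eq") and the statement forms ("let", "print") are
all invented for this example.

```python
def evaluate(node, env):
    """Evaluate a tiny expression tree built from nested tuples."""
    kind = node[0]
    if kind == "num":   # ("num", 3)
        return node[1]
    if kind == "var":   # ("var", "x")
        return env[node[1]]
    if kind == "add":   # ("add", left, right)
        return evaluate(node[1], env) + evaluate(node[2], env)
    if kind == "eq":    # ("eq", left, right) yields 1 or 0
        return int(evaluate(node[1], env) == evaluate(node[2], env))
    raise ValueError(f"unknown node kind: {kind!r}")


def run(program):
    """Run a list of ("let", name, expr) and ("print", expr) statements.

    Returns the list of printed values instead of writing to stdout.
    """
    env = {}
    output = []
    for stmt in program:
        if stmt[0] == "let":
            env[stmt[1]] = evaluate(stmt[2], env)
        elif stmt[0] == "print":
            output.append(evaluate(stmt[1], env))
        else:
            raise ValueError(f"unknown statement: {stmt[0]!r}")
    return output
```

The semantics of "add" are whatever Python's + does to the values, and an
undefined variable is whatever a KeyError does. No extra decisions were
needed to get something that runs.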


Don't get fancy. The original C compilers were almost like BASIC, just
slightly more complex. And even though they were compilers, not
interpreters, you got whatever code they generated. It just happened (well,
actually a lot of theory went into making it "just happen") to easily match
the machine/assembly language of the machines of that era. Even the stuff
that was added to C was often done so to keep the implementation simple.
Header files are a good example. They let you put together slightly more
complex programs, but they only work if the programmer uses them right. If
you have inconsistent, conflicting header files, you get "undefined
behavior", a code word for "whatever the implementor decided to do".
Maybe (if you are lucky) you get an error, but maybe you get code that just
doesn't work.
------------------------------


But static typing. No. It doesn't help. Simplicity of implementation wants
you to throw away all those types. What static typing gives you is reliable
and well-defined programs, not a simple implementation.


Ahead of time compilation, same thing. Does not make the implementation
easier. It has other attributes but simplicity of implementation is not
necessarily one of them. (In some cases it can be simpler, but not always.
An interpreter is almost always simpler than any compiler for the same
amount of functionality.)
------------------------------


*Edit added:*


By the way, that's how many introductory Compiler classes are structured.
Take a relatively simple language (C or Pascal are popular
choices, lisp dialects are even simpler) and then throw things out. One
type "int" which is a fixed width (e.g. 32 bit) signed integer, no
conversions. Allow only one function "main". Allow only one arithmetic
operation "add" (+). Allow only one comparison "equal" (==). If you are
generating code rather than doing an interpreter, pick the simplest
architecture you can (e.g. MIPS) and then only allow constants of 16 bits
so you don't need hi/lo. Now, you have a simple enough language that a
student can likely get it working in one semester (or even one quarter).
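A code generator for such a stripped-down language might look roughly like
the sketch below. The tuple tree format, the function name gen, and the
register and stack conventions (result in $t0, temporaries pushed on the
stack) are assumptions chosen for the example. Constants outside the signed
16-bit range are simply rejected, which is what lets a single addiu stand in
for the usual lui/ori pair.

```python
def gen(node):
    """Return MIPS assembly lines that leave the value of node in $t0.

    node is ("num", n) or ("add", left, right).
    """
    kind = node[0]
    if kind == "num":
        n = node[1]
        # Only signed 16-bit constants: one addiu, no hi/lo halves needed.
        if not -32768 <= n <= 32767:
            raise ValueError(f"constant {n} does not fit in 16 bits")
        return [f"addiu $t0, $zero, {n}"]
    if kind == "add":
        code = gen(node[1])              # left operand ends up in $t0
        code += ["addiu $sp, $sp, -4",   # push $t0
                 "sw $t0, 0($sp)"]
        code += gen(node[2])             # right operand ends up in $t0
        code += ["lw $t1, 0($sp)",       # pop the left operand into $t1
                 "addiu $sp, $sp, 4",
                 "addu $t0, $t1, $t0"]   # $t0 = left + right
        return code
    raise ValueError(f"unknown node kind: {kind!r}")
```

Pushing every intermediate value on the stack is wasteful, but it means the
generator never has to think about register allocation, which is the kind
of trade a one-semester project makes.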


Believe it or not, that's actually how a lot of "real" compilers are
written. You do a "spike", that is, pick one *exceptionally* simple case and
get it working end-to-end. Then, you build around that. If something looks
hard, you do a new spike that makes that issue as simple as possible and
get that working.
------------------------------


Even C++ was built that way. It started with a working C compiler as a
base(*). Then Stroustrup added, feature by feature (probably using C
macros) the things he wanted to make it object-oriented, to make it "C with
classes". He didn't start with multiple-inheritance and templates and the
STL. You can even see the results of that in the design of C++.


I suspect the weird way that constructors take parameters as
ctor_name(arg1, arg2, arg3) comes from that. Ctors were probably initially
turned into macros and that was C's syntax for macros. The fact that it
makes certain declarations ambiguous wasn't noticed because in the "spike"
they worked as intended. The complexity of the other case (how you
sometimes can't tell a function declaration from a constructor call) was
ignored until later.


Similarly, the fact that you need to use "new" and "delete" instead of
"malloc" and "free" is the same thing. In a spike that made it easy. Fixing
malloc and free to know when things had ctors and initializing them
properly would have been more work. Adding new functions that did so was
easier. Thus simplicity of implementation ruled and the complexity for
users was not factored in.


I could go on. Even later when C++ had a standards committee, things were
added one feature at a time. The STL didn't exist until after C++ had
templates. The move semantics rules were a patch to fix up a case where
things that were initially simple didn't do what users wanted. But again,
they were done as a "spike": add only one feature at a time. And sometimes,
one has to add new features or specifications to fix up the interaction of
the features which slowly accreted.


*) And starting with a C compiler as a base gave Stroustrup a simple model
to start with. Writing C code is easier than writing assembly code, even
for a PDP-11. Again, simplify as much as possible to make one's
implementation easy.


Lots of "lisp" interpreters are written in lisp, because that's an easy way
to express lisp's semantics. You then have a small program written in lisp,
that you need to hand-implement. Once that program works, you bootstrap
your way up to the whole interpreter you want.


When we did a Jovial compiler at my first job, we started with PL/I macros
that gave us the subset of Jovial that we needed. We didn't worry about the
cases where the PL/I semantics weren't exactly the same as Jovial; we
weren't going to use those features anyway. Again, sweep any hard semantics
under the rug and don't worry about them. Make your implementation simple
and accept whatever semantics it gives you. Label anything that doesn't
work the way you want in your implementation, "undefined behavior".
------------------------------


By the way, Richard P. Gabriel famously wrote about this, coining the phrase
"Worse is better". Here <https://en.wikipedia.org/wiki/Worse_is_better> is a
link to a Wikipedia article derived from his ideas.
--
******************************************************************************


Chris Clark email: christopher.f.clark@compiler-resources.com
Compiler Resources, Inc. Web Site: http://world.std.com/~compres
23 Bailey Rd voice: (508) 435-5016
Berlin, MA 01503 USA twitter: @intel_chris


