Re: Compilers for systems programming (was: A C style compiler)

eric dahlman <dahlman@cs.colostate.edu>
7 May 1998 16:55:28 -0400

          From comp.compilers

Related articles
Compilers for systems programming (was: A C style compiler) rideau@ens.fr (Francois-Rene Rideau) (1998-05-04)
Re: Compilers for systems programming (was: A C style compiler) dahlman@cs.colostate.edu (eric dahlman) (1998-05-07)
Re: Compilers for systems programming (was: A C style compiler) ok@atlas.otago.ac.nz (Dr Richard A. O'Keefe) (1998-05-07)
Re: Compilers for systems programming (was: A C style compiler) dwight@pentasoft.com (1998-05-12)
Re: Compilers for systems programming (was: A C style compiler) eeide@cs.utah.edu (Eric Eide) (1998-05-12)
Re: Compilers for systems programming (was: A C style compiler) will@ccs.neu.edu (William D Clinger) (1998-05-12)
Re: Compilers for systems programming (was: A C style compiler) ct7@mitre.org (W. Craig Trader) (1998-05-15)
Re: Compilers for systems programming (was: A C style compiler) sperber@informatik.uni-tuebingen.de (1998-05-15)
[2 later articles]
| List of all articles for this month |

From: eric dahlman <dahlman@cs.colostate.edu>
Newsgroups: comp.compilers
Date: 7 May 1998 16:55:28 -0400
Organization: Colorado State University, Fort Collins, CO 80523
References: 98-05-017
Keywords: practice

Francois-Rene Rideau <rideau@ens.fr> writes:


> To revert to more on-topic tracks, I wonder if there are other
> language besides C and cousins (Pascal, Ada, Modula-3, Oberon), whose
> design/implementation combos were fit for programming systems
> internals.


Another example is ML, I don't have the reference handy but as I
remember it the SML of NJ folks have a ML based OS. It was created by
"porting" the ML runtime to the Flux OS toolkit out of the University
of Utah. As I remember a Java OS was similarly created and even an
EmacsOS was proposed but I don't know if it was completed. At any
rate elisp and Java are bytecompiled so they may not address you low
level compiler concerns.


> As an important example, what language with automatic memory
> management could be used to *fully* implement its own (efficient) GC
> without hiding lots of details in custom hardware, or arbitrary
> software implementation choices?
>
> I know for instance, that LISP Machines were fully programmed in LISP,
> down to the "assembly". How did they handle the situation?


Well the Ivory processor in my Symbolics uses some fancy hardware to
handle the problem of efficient GC. But that is not really necessary
as far as I can tell. The only real problem to having the GC written
in the language it is GCing is how does it satisfy its own memory
requests. This is only a minor difficulty once the inherent
circularity is broken, for that matter the GC may not need to allocate
any memory for its own use thereby rendering this issue moot. In
practice GCs are written in some lower language than that which they
are collecting some of the reasons for this are:


      * GCs are generally small (a few of pages of code) so the benefits
          of implementing them in a higher level language would not be very
          great.


      * Because of this small size it is reasonable to invest the effort
          in hand optimizing them to death to get as much performance as
          possible. This fine tuning may not be easily accessible in the
          higher level language.


      * GCs need to be there while the rest of the system is being
          developed, therefore, it is simpler to bootstrap the system with
          a GC in another language.


      * Others I am sure I left out.




> It seems to me they did rely on many details and implementation choices
> provided by custom hardware. How far can the approach be recycled
> for common hardware architectures?
> [Hum. Looks like time for me to read and grok CMUCL sources. -- Faré]
> [And what if my first name was John? -- Faré]


Garbage collectors can really benefit from the different types of
hardware support available in the system, and this support need not be
as fancy as that found in a Lisp machine to be of great benefit. For
instance the GC can make use of features of standard hardware like:


      * Knowing which pages have been dirtied since the last GC.


      * Knowing which pages have been swapped out.


      * The ability to trap reads and writes to certain pages.


However, these facilities are not always available because the
hardware may not support them or the OS isn't cooperative. The impact
of some of the above features can be so great that if you want to
have a really efficient GC I think that you would need to match it to
the hardware available. Now it may be possible to implement a general
"Parameterized GC" which would be adaptable to a range of machines,
but then you are only fudging on the performance/specificity trade
offs.


>
> It looks like not imposing such arbitrary hardwired decisions calls
> for a "reflective architecture". Are there any language/compilers
> developped around such a reflective architecture, allowing for
> arbitrary reimplementations by the user, without any choice being
> cast-in-iron between the language semantics and the actual hardware?


The is a place for the ability to control what type of garbage
collector is being used in a given system. Many systems provide
several parameters for tuning the behavior of the provided collector
but I can only think of one that allows the user the ability to
specify their own GC, and that is Ada 95.


Ada 95 kind of sits on the proverbial GC fence in that the language is
specified as being optionally garbage collected. This is primarily to
satisfy the real time crowd in the Ada camp (note that there are also
real time GCs in existence). In reality few implementations provide
garbage collection and most people do not program like it exists.
They do this by using something called an unchecked_deallocation (I
may be wrong on the function's name it has been a while since I used
Ada) which is like the familiar function free(). So where does a user
GC fit into all of this? The Ada folks realized that different types
of data may have different use patterns and could possibly benefit
from different memory management strategies so Ada 95 has the concept
of a pool. A pool is a memory object under user control which manages
memory requests like malloc's and free's. It may also GC itself and
do all the other good things you may want. Using pools it would be
possible to have a prolog system that allocated memory from the
"prolog pool" and a lisp system that allocated memory from the "lisp
pool" and each pool would then be GCed by a collector tuned to the use
patterns of each system. And the user could make a new "my pool" that
would be GCed in yet another way.


> Are any of these free software, or otherwise with available sources
> and/or good descriptions?


There is an Ada 95 front end to GCC called GNAT (not GNATS) that is
freely available. There should be a pointer to it on prep.ai.mit.edu
somewhere.


-Eric
--


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.