Re: Best Ref-counting algorithms?

George Neuner <>
Thu, 16 Jul 2009 12:35:26 -0400

          From comp.compilers

Related articles
[14 earlier articles]
Re: Best Ref-counting algorithms? (BGB / cr88192) (2009-07-15)
Re: Best Ref-counting algorithms? (BGB / cr88192) (2009-07-15)
Re: Best Ref-counting algorithms? (BGB / cr88192) (2009-07-15)
Re: Best Ref-counting algorithms? (Gene) (2009-07-15)
Re: Best Ref-counting algorithms? (2009-07-16)
Re: Best Ref-counting algorithms? (BartC) (2009-07-16)
Re: Best Ref-counting algorithms? (George Neuner) (2009-07-16)
Re: Best Ref-counting algorithms? (Hans Aberg) (2009-07-17)
Re: Best Ref-counting algorithms? (Hans Aberg) (2009-07-17)
Re: Best Ref-counting algorithms? (Larry) (2009-07-17)
Re: Best Ref-counting algorithms? (=?ISO-8859-1?Q?Christoffer_Lern=F6?=) (2009-07-17)
Re: Best Ref-counting algorithms? (glen herrmannsfeldt) (2009-07-17)
Re: Best Ref-counting algorithms? (George Neuner) (2009-07-17)
[13 later articles]
| List of all articles for this month |

From: George Neuner <>
Newsgroups: comp.compilers
Date: Thu, 16 Jul 2009 12:35:26 -0400
Organization: A noiseless patient Spider
References: 09-07-018 09-07-039 09-07-043
Keywords: GC
Posted-Date: 16 Jul 2009 18:36:52 EDT

On Wed, 15 Jul 2009 00:05:12 -0700 (PDT), Christoffer Lernv
<> wrote:

>On Jul 14, 8:27 pm, George Neuner <> wrote:
>> On Sun, 12 Jul 2009 13:41:59 -0700 (PDT), Christoffer Lernv
>> >If one would like to implement a fast GC using refcounting, should I
>> >implement something like in those papers, or are there better
>> >algorithms out there?
>> I haven't read Bacon's paper, but you'll note that Lins uses a marking
>> scan as a backup to identify and break data cycles for the reference
>> counter. Although the marking scan need not be run unless memory is
>> tight, it significantly complicates the overall implementation.
>I implemented the simple non-concurrent version of the algorithm as
>described in Bacon's paper and I did not find it all that complex, the
>concurrent version is naturally more involved though.

Marking is certainly not difficult - not even when done concurrently,
but tuning it for good performance is tricky. In particular, how you
handle the queue - as a stack or as a FIFO - and how you handle out of
space conditions on the queue are critical.

>... what I find interesting is that the scanning in the case of
>ref-counting is on a small subset of all objects. For more advanced
>tracing collectors, this is also the case. There is a nice paper on
>the duality between ref-counting and tracing collectors by Bacon,
>Cheng and Rajan.

You can only safely restrict scanning if you can segregate objects
that are shared (or could be). Most programming languages don't allow
for that distinction up front, so the compiler or runtime environment
has to handle it behind the scenes.

I previously mentioned 1-bit counting which is one method of handling
the segregated sharing issue. Copying a pointer requires a check of
the object to see whether it is unique or shared. You can freely copy
a pointer to a shared object, but if the object is not shared you
first move it to a separate "shared object" heap, mark it shared
(obviously), and save the new address to both the source and
destination of the pointer copy. Then you can free the original
unshared object.

>Bacon also claims that their collector got really good results when it
>comes to GC cost, quite competitive with tracing collectors.

I have no doubt in Bacon's claims for the applications and conditions
under which he tested.

All collectors have their own "sweet spot" of environment conditions
under which they work very well ... reference counters are no

For general purpose programming, most studies have shown that tracing
or copying collectors have lower average performance impact on the
mutator(s). Reference counters are known to work better in
constrained memory situations and are pretty much the only effective
way to implement distributed collectors where it's impossible to use
pointers due to disjoint memories (though better ways are being
actively researched).

>I guess I simply have to implement a few different collectors (tracing
>and ref-counted) and see which type I like best.

If you're after very high performance (a hobby of mine due to a stint
in real time programming), you should look into Baker's treadmill
collector. Treadmills are expensive space-wise, but they have fixed
processing costs and worst case behavior that is completely
predictable (a critical requirement for RT).

Every GC design has weak spots and most have worst case performance
that is extremely bad (treadmills are an exception to bad worst case
performance, but their weak spot is fixed size allocation blocks).

The trick is to anticipate the use - which is all but impossible if
the GC is intended to support general programming - and design for
either the best average case or the worst possible case. Most GC
systems today are hybrids designed for the best average case.


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.