Re: 'conservative' GC == 'risky' GC (Hans Boehm)
Fri, 27 May 1994 18:14:22 GMT

          From comp.compilers

Related articles
'conservative' GC == 'risky' GC (1994-05-21)
Re: 'conservative' GC == 'risky' GC (1994-05-22)
Re: 'conservative' GC == 'risky' GC (1994-05-23)
Re: 'conservative' GC == 'risky' GC (1994-05-24)
Re: 'conservative' GC == 'risky' GC (1994-05-25)
Re: 'conservative' GC == 'risky' GC (1994-05-26)
Re: 'conservative' GC == 'risky' GC (1994-05-27)
Re: 'conservative' GC == 'risky' GC (1994-05-27)
Re: 'conservative' GC == 'risky' GC chase@Think.COM (1994-05-26)
Re: 'conservative' GC == 'risky' GC (1994-05-31)
Re: 'conservative' GC == 'risky' GC (1994-05-30)
Re: 'conservative' GC == 'risky' GC (1994-05-31)
Re: 'conservative' GC == 'risky' GC (1994-06-07)
Re: 'conservative' GC == 'risky' GC (1994-06-13)
| List of all articles for this month |

Newsgroups: comp.compilers
From: (Hans Boehm)
Keywords: GC
Organization: Xerox Palo Alto Research Center
References: 94-05-084 94-05-116
Date: Fri, 27 May 1994 18:14:22 GMT (Greg Morrisett) wrote:
> Actually, it's just as much of a misnomer to call a copying or mark-sweep
> or tracing garbage collector "accurate" or "non-convservative". It is
> generally uncomputable whether an object is garbage (i.e. is going to be
> accessed in the future or not.) We've grown accustomed to using
> "pointer-reachability" as an approximation as to what must be preserved,
> but this is only a _conservative_ approximation. (Mark Tillotson) writes:
>If you have a debugger and value-inspector in your system, then
>pointer-reachability is what the user wants and expects, because the user
>view is of a graph of nodes, not purely the semantics of the original

Agreed, but the details don't seem to work out quite right. The garbage
collector looks at pointer-reachability in the data structure maintained
by the compiler-generated code and run-time system. This is unlikely to
be what the programmer has in mind. Given that no language definition,
and few implementations, define pointer reachability at the source level,
it's not quite clear what the programmer should have in mind.

2 examples of how things get sticky:

1. Consider a language that allows general function closures. Assume that
f has local variables x and y and a local function g. Function g
references x but not y. I'm debugging inside a call to g. A good
debugger would probably let me look at x. Most ML and Scheme
implementations would keep x's binding in something like g's activation
record. However most would not keep the appropriate binding of y
anywhere. But I can probably still name it in the debugger. (Many ML and
Scheme programmers would be upset if y's binding were still guaranteed to
be kept around, I think. That would be considered space inefficient. But
others will be surprised at not being able to get its value from the

2. Consider a compiler that understands bignum arithmetic and understands
that factorial is a pure function. Suppose I write

x := log(fact(100000)); g(...); y := fact(100000); ...

The compiler might eliminate the second call to fact as a common
subexpression. But now fact(100000) is reachable during the call to g,
and the garbage collector's notion of pointer reachability no longer
reflects what the the programmer is likely to have had in mind. (And
indeed if this code happened to occur inside g, this optimization might
have disastrous consequences.) Is this a legal optimization? If not, can
I eliminate (x+1) as a common subexpression, if the result may have to be
heap allocated?

SML of NJ attempts to get this at least asymptotically right, for their
definition of (source-level) pointer reachability. But doing so isn't
free. And they have lots of war stories.

My impression is that this gets much harder still in Prolog. Some Prolog
garbage collectors actually do better than (machine level) pointer

Hans-J. Boehm

Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.