|GC question email@example.com (Paul Rubin) (2007-07-27)|
|Re: GC question firstname.lastname@example.org (Ulf Wiger) (2007-07-27)|
|Re: GC question email@example.com (Gene) (2007-07-28)|
|Re: GC question firstname.lastname@example.org (Ray Dillinger) (2007-07-28)|
|Re: GC question @iecc.com <email@example.com (Paul@iecc.com, Rubin) (2007-07-30)|
|Re: GC question firstname.lastname@example.org (2007-08-13)|
|From:||Ulf Wiger <email@example.com>|
|Date:||Fri, 27 Jul 2007 16:30:53 +0200|
|Posted-Date:||27 Jul 2007 11:27:21 EDT|
>>>>> "PR" == Paul Rubin <firstname.lastname@example.org> writes:
PR> Suppose you build a big list of cons cells, say a billion of
PR> them (you're on a large machine). This is in a runtime with
PR> traditional marking or copying gc, no generation scavenging or
PR> region inference or anything like that. The collector runs
PR> every N bytes of allocation for some fixed N. Yes I know that's
PR> a dumb way to write an allocator for a big-system implementation
PR> but it's fairly common for small ones.
PR> It seems to me that the running time to allocate N cells is
PR> O(N**2) because you run the collector O(N) times during the
PR> allocation, and each collection costs O(N) on average.
PR> I never realized this before. Is it a well-known phenemonon?
Where I come from (industrial use of Erlang), we have design rules
warning about this very thing.
PR> Is the main answer something like generation scavenging?
<disclaimer>I have only superficial knowledge of GC techniques</..>
Not necessarily. It can actually make things worse.
If you, say, enter a loop that constructs a list one element
at a time, the garbage collector will have no way to tell
when you're going to stop. It will run out of heap, scan,
tenure data, etc., but as the heap just keeps growing, it's
not going to be able to reclaim much, if anything.
There may be different ways of going generational GC, but at
least in Erlang, the collector will keep an old heap and a new
heap, and revert to fullsweep GC if the generational scan fails
to reclaim enough data. If the fullsweep fails, it will resize
the heap. With constantly growing heap, this is likely to occur
at every GC stop. Extremely annoying.
A workaround specific to Erlang is that you can spawn a process
with a predefined minimum heap size - preferrably large enough
that GC will never kick in. It can build the data, then send it
in a big whopping message to the process that needs it. It
will be costly to copy the message, but not nearly as much as
paying for all those GC stops (which also copy).
Most of the time, you try to come up with an implementation that
doesn't build huge data structures. Again in Erlang, this is
usually achieved by exploiting the concurrency patterns in the
problem and dispatching data in small chunks to other processes.
Some seasoned GC expert will perhaps give you a good solution
to the problem. I can only tell you that you're not alone. (:
Ulf Wiger, Senior Specialist,
/ / / Architecture & Design of Carrier-Class Software
/ / / Team Leader, Software Characteristics
/ / / Ericsson AB, IMS Gateways
Return to the
Search the comp.compilers archives again.