[MLton-devel] Stop-and-copy / mark-compact switch criterion
Alain Deutsch
deutsch@polyspace.com
Fri, 16 Aug 2002 17:45:02 +0200 (MET DST)
Stephen,
A follow up on the GC switch criterion discussion. I agree that
contrary to what I initially suggested, the criterion proposed by
you seems preferable as it certainly avoids paging.
However what I was trying to do was to address a behaviour that I
found surprizing: during a self compile with gc-messages on on a
1Gb machine, there are first stop-and-copy GCs (first third on the
run), followed by a transcient allocation peak which causes a
switch to mark-compact. But then even when live/heap decreases
under 9% the GC still chooses mark-compact. This seems suboptimal
to me.
I think I found a better way to improve the switch criterion: in
fact the precursor to the S&C vs. M&C decision lies in
computeSemiSize(...). This routine nominally assigns a desired
heap size of LIVE_RATIO*live ( = 8*live).
I am not sure this is always optimal as I think that this code
indirectly forces the GC mode to mark-compact as soon as heap/live
<= 9 (where heap = size of RAM * ramslop). Assume heap/live = 9.
This routine will select a fromSpace size of 8/9*heap at the end
of the GC cycle. The mutator runs. New GC, assuming live has not
changed, we meet the criterion for switing to mark-compact as
fromSize = 8/9*heap, which if we add live = 1/9*heap yields
heapsize, thus causing a switch to mark-compact.
Intuitively, if the heap/live ratio is for instance of 5, it seems
preferable to use stop-and-copy with a smaller fromSpace (small
enough to ensure that fromSize+live fits the heap) instead of
using a fromSpace that occupies the entire heap (which forces us
to use mark-compact) as the former is presumably much cheaper that
the later. In the first case we will GC slightly more often with a
cheaper GC.
Can we formalize this ? I propose the following low-cost
modelisation:
------------------------------------------------------------------
Summary: a proposal for an improved criterion for choosing
semispace sizes that can decrease GC time by favoring
stop-and-copy without paging more.
Definitions:
m total bytes allocated by mutator
heap ram size * ramslop
live live data set in bytes
E cost(mark-compact)/cost(stop-and-copy) on the same heap
k heap/live
Assumption: heap>=2*live (we do not consider here paging cases for
which is is clear that mark-compact collection is faster).
1) after one stop-and-copy gc, there are heap-2*live bytes
available for the mutator (assuming the entire heap is used).
2) after one mark-compact gc, there are heap-live bytes available.
3) if we use only mark-compact, there will be about:
ngc(m&c) = m/(heap-live)
gcs for the entire computation. Similarly:
ngc(s&c) = m/(heap-2*live)
The relative efficiency of mark-compact is thus:
R=(ngc(m&c)*E)/ngc(s&c)
Elementary calculations yield:
R=(m*E/(heap-live))/(m/(heap-2*live))
=(E/(heap-live))/(1/(heap-2*live))
=E*(heap-2*live)/(heap-live)
as heap=live*k:
R=E*(k*live-2*live)/(live*k-live)
R=E*(k-2)/(k-1)
Numerical validation:
Let E=3 (plausible). We have the following:
k R
------------
10 2.66
5 2.25
4 2
3 1.5
2.6 1.12
2.5 1.0
2.4 0.86
2.1 0.27
For instance, this model predicts that if the cost of a single
mark-compact collection is three times the cost of a single
stop-and-copy (E=3), then if the live ratio k = 4 then using
mark-compact will globally cost 2 times the cost of using
stop-and-copy. If k<=2.5 then this model predicts that
mark-compact will yield better performances than stop-and-copy.
End of numerical validation.
We now want to find analycally for which value of k we have R=1,
as this will be the threshold. Elementary calculation yield:
R=1
<=> E*(k-2)/(k-1)=1
<=> k=(2*E-1)/(E-1)
Tabulating this function k of E yields the following table which
under the hypotheses above gives for each value of the relative
efficiency factor E the live ratio k up to which it is faster to
use stop-and-copy.
E k
------------
1.5 4.0
2 3.0
2.5 2.66
3 2.5
3.5 2.4
4 2.3
For instance, if an individual mark-compact GC cost 3 times a
stop-and-copy GC, then for live sets of up to heap/2.5 bytes it is
preferable to use stop-and-copy.
This confirms the intuition that unless the live set is fairly
large, then stop-and-copy is better.
As regards the implementation of this idea, the function computeSemiSize in gc.c (version 1.60):
static W32 computeSemiSize (GC_state s, W64 live) {
W32 res;
res = min64 (s->totalRam + s->totalSwap,
max64 (live * LIVE_RATIO_MIN,
min64 (s->ramSlop * s->totalRam,
live * LIVE_RATIO)));
res = roundPage (s, res);
...
return res;
}
The idea is to choose at most for res (in nominal cases):
1) heap-live (that is ramSlop*totalRam) if we want stop-and-copy to be possible on the next GC cycle
2) heap if we favor mark-compact.
This yields the following code:
static W32 computeSemiSize (GC_state s, W64 live) {
W32 res;
double ratio, heap, cutoff, relative_cost_of_mark_compact_collection;
relative_cost_of_mark_compact_collection = 3; // the E term. Can be adjusted
cutoff = (2*relative_cost_of_mark_compact_collection-1)/(relative_cost_of_mark_compact_collection-1);
heap = (s->ramSlop * s->totalRam);
if (heap/live >= cutoff) /* computed analytically assuming that a mark-compact GC costs
RELATIVE_COST_OF_MARK_COMPACT_COLLECTION times a stop-and-copy GC */
{
ratio = 0.99*((heap/live)-1);
if (ratio > LIVE_RATIO)
ratio = LIVE_RATIO;
}
else
ratio = LIVE_RATIO;
res = min64 (s->totalRam + s->totalSwap,
max64 (live * LIVE_RATIO_MIN,
min64 (s->ramSlop * s->totalRam,
live * ratio)));
...
return res;
}
On a self-compile (on a P4, 2.2GHz, 1Gb RAM) this yields the
following timings, with visibly much more stop-and-copying than
mark-compactions.
with new tuning:
================
GC time(ms): 36,150 (18.8%) 178.67user
with old tuning
===============
GC time(ms): 59,410 (30.2%) 210.78user
Feedback welcome of course.
Stephen, please feel free to integrate this in your current branch
if you agree (perhaps under the control of a option if you want to
minimize interference with your current developments).
--
Alain Deutsch, CTO tel.: +33 (0)1 49 65 32 64
PolySpace Technologies fax.: +33 (0)1 49 65 05 77
mailto:deutsch@POLYSPACE.COM http://www.polyspace.com
-------------------------------------------------------
This sf.net email is sponsored by: OSDN - Tired of that same old
cell phone? Get a new here for FREE!
https://www.inphonic.com/r.asp?r=sourceforge1&refcode1=vs3390
_______________________________________________
MLton-devel mailing list
MLton-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mlton-devel