[MLton] share bug

Henry Cejtin henry.cejtin@sbcglobal.net
Tue, 18 Apr 2006 18:39:12 -0500

I have run my program compiled with
    -debug true
using a version of gc.c with DEBUG_SHARE set to TRUE and with the command
line arguments
    @MLton gc-messages --
The resulting stderr output is 3.6 gigabytes. From what I can see in it, the
segfault appears to happen in the first GC after the first call to share.

The output at the start of the first share call is

    GC_share 0x600b3398
    using minor space
    maxElementsSize = 10436234
    elementsIsInHeap = TRUE
    elementsSize = 8388608
    0x08074018 = newTable ()
    hashCons (0x600ba678)
    0x600ba678 = hashCons (0x600ba678)
    hashCons (0x600ba680)
    0x600ba680 = hashCons (0x600ba680)
    hashCons (0x68d8d808)
    tableInsert (3941444285, 0x68d8d808, TRUE, 0x00000003, 0x68d8d814)
    probe = 0x0038075b
    slot = 0x0038075b
    numProbes = 1
    0x68d8d808 = hashCons (0x68d8d808)

The output at the end is

    hashCons (0x600b3398)
    tableInsert (368121331, 0x600b3398, TRUE, 0x00000055, 0x600b367c)
    probe = 0x01ff36dd
    slot = 0x01ff36de
    slot = 0x03fe6dbb
    0x600b3398 = hashCons (0x600b3398)
    32,016,852 bytes hash consed (5.7%).
    Starting gc.  Request 0 nursery bytes and 4,000,012 old gen bytes.
    Minor GC.
    Minor GC done.  294,392 bytes copied.
    Finished gc.
    time: 6,890 ms
    old gen size: 896,669,008 bytes (84.3%)
    gc.c: assertIsInFromSpace p = 0x665aeab4  *p = 0x9ac95540);

Just before the first call to share I print out the result of MLton.size on
the object that is about to be shared, and just after I do the same. Both of
these succeed; the size before is 561,950,212 bytes and the size after is
529,933,360 bytes.
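
As a quick cross-check of those two sizes, here is a small Python sketch (the
variable names are just for illustration). It computes the difference between
the two MLton.size results and expresses it as a fraction of the size before
sharing, which is only a guess at the denominator behind the percentage in the
log.

```python
# MLton.size results reported above, in bytes.
size_before = 561_950_212
size_after = 529_933_360

# Bytes no longer reachable as separate copies after share.
saved = size_before - size_after

# Fraction of the original size. Assumption: this is the same ratio the GC
# message reports; the log does not say what the percentage is relative to.
percent = 100.0 * saved / size_before

print(f"{saved:,} bytes saved ({percent:.1f}%)")
```

The difference comes out to 32,016,852 bytes, the same figure as the
"bytes hash consed" line in the log, so the two MLton.size results agree with
what share reported.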

Looking at lines of the form
    ?output? = hashCons(?input?)
there are 11,682,512 calls and the same number of ?input? values, ranging
from 0x600b3398 to 0x9e0b2b00. There are 10,022,696 ?output? values,
covering the same range. Every ?output? value was also an ?input? value. The
value in the assert
    p = 0x665aeab4
was not one of the ?input?s, but the second value
    *p = 0x9ac95540
was both an ?input? and an ?output? value.
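
For anyone who wants to redo this kind of tally, the following is a rough
Python sketch, not the tool actually used for the numbers above. It assumes
the result lines look exactly like the excerpts in this message, and the
names RESULT_LINE and summarize are made up for the example. The sample is
the first excerpt shown earlier.

```python
import re

# Matches result lines such as "0x600ba678 = hashCons (0x600ba678)".
# Assumption: the format is exactly as in the excerpts, with optional
# whitespace before the open paren. Bare "hashCons (...)" call lines and
# tableInsert/probe/slot lines do not match and are skipped.
RESULT_LINE = re.compile(
    r"^\s*(0x[0-9a-fA-F]+) = hashCons\s*\((0x[0-9a-fA-F]+)\)\s*$"
)

def summarize(lines):
    """Return (number of result lines, set of inputs, set of outputs)."""
    calls = 0
    inputs = set()
    outputs = set()
    for line in lines:
        m = RESULT_LINE.match(line)
        if m is None:
            continue
        calls += 1
        outputs.add(int(m.group(1), 16))
        inputs.add(int(m.group(2), 16))
    return calls, inputs, outputs

# The excerpt from the start of the first share call.
sample = """\
    hashCons (0x600ba678)
    0x600ba678 = hashCons (0x600ba678)
    hashCons (0x600ba680)
    0x600ba680 = hashCons (0x600ba680)
    hashCons (0x68d8d808)
    tableInsert (3941444285, 0x68d8d808, TRUE, 0x00000003, 0x68d8d814)
    probe = 0x0038075b
    slot = 0x0038075b
    numProbes = 1
    0x68d8d808 = hashCons (0x68d8d808)
"""

calls, inputs, outputs = summarize(sample.splitlines())

print("calls:", calls)
print("distinct inputs:", len(inputs), "distinct outputs:", len(outputs))
print("input range:", hex(min(inputs)), "to", hex(max(inputs)))
print("every output is also an input:", outputs <= inputs)

# The two addresses from the assert message.
for name, addr in (("p", 0x665AEAB4), ("*p", 0x9AC95540)):
    print(name, hex(addr), "input:", addr in inputs, "output:", addr in outputs)
```

On the real log the same function can be fed an open file object instead of
the sample, so that the 3.6 gigabytes are read line by line. The two sets will
still hold several million addresses each, so it needs a fair amount of
memory.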

(Not that it is at all important, but note the missing open paren or extra
close paren in the assert error message.)

Anyway, any ideas about what is going on?