[MLton] Experimental 64-bit binary package (& SVN sources)

Wed Feb 28 23:06:22 PST 2007

I have posted an experimental amd64-linux (64-bit mode) binary package
to the usual place:
   http://www.mlton.org/Experimental
The corresponding sources can be checked out with:
   svn co svn://mlton.org/mlton/branches/on-20050822-x86_64-branch mlton.svn.x86_64
(You can skip the binary package and use a 32-bit build (preferably
from a recent trunk checkout) to bootstrap.)

This isn't a fully polished release.  However, it does pass all the
regression tests and bootstraps itself, and there are no known bugs.

Currently, I'm mostly interested in the correctness of produced code;
please report any compile-time assertions or bugs in the resulting
executables.  I'm less interested in portability and configuration
issues.  I'm least interested in performance comparisons.

Items of note:

  * Only the C-backend is supported on amd64-*.

  * Building the bytecode interpreter in 64-bit mode is broken.  For
    expediency, I've simply disabled the bytecode backend entirely (for
    all platforms).

  * All object-pointers are 64 bits.

    + Since the forwarding object-pointer for the copying collector is
      written over the initial bytes of the object, all objects must
      have size at least equal to the size of an object-pointer.

    + Since the mark-compact collector threads object-pointers through
      the object headers, all object headers are 64 bits.

    + Since an object header must be distinguishable from an array
      counter (used for dfs marking), array counters are 64 bits.

    + Array lengths are also 64 bits.

    + This means that a zero-length array consumes 32 bytes on a 64-bit
      platform (whereas it consumed 16 bytes on a 32-bit platform).
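
    The arithmetic behind those two figures can be sketched in C as
    below.  This is only a model under stated assumptions, not code from
    the MLton runtime: the helper array_object_bytes is hypothetical, it
    assumes three pointer-sized metadata words (counter, length, header)
    followed by a payload of at least one pointer-sized word (room for
    the forwarding pointer), and it ignores any alignment padding.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical size model for an array object (NOT MLton runtime code).
   Assumed layout: counter, length, header (one word each), then the
   elements.  The payload is padded up to one word so that a copying
   collector has room to write a forwarding pointer over it. */
static size_t array_object_bytes(size_t word_bytes, size_t elem_bytes,
                                 size_t n_elems)
{
    assert(word_bytes == 4 || word_bytes == 8);

    size_t metadata = 3 * word_bytes;        /* counter + length + header */
    size_t payload  = elem_bytes * n_elems;  /* raw element storage */

    if (payload < word_bytes)
        payload = word_bytes;                /* room for forwarding pointer */

    return metadata + payload;
}
```

    With word_bytes = 4 and no elements the model gives 16, and with
    word_bytes = 8 it gives 32, matching the sizes quoted above.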

  * For some datatypes with variants carrying a Real32.real, we are
    currently forced to box the Real32.real.  The packed representation
    would normally like to shift and tag variants carrying values
    smaller than an object pointer (the low-order tag bit can be used
    to distinguish this variant from a pointer variant).  However, in
    order to shift and tag a Real32.real, we must cast the Real32 to a
    Word32, zero-extend the Word32 to a Word64, then shift and tag the
    Word64.  Unfortunately, we don't have a *bitcast* from Real32 to
    Word32; naively emitting
      Real32 r = ...;
      Word32 w = (Word32)r;
    results in a coercion that attempts to preserve the numeric meaning
    of r, rather than the bits of r.  Simply more evidence that we need
    real/word bitcast primitives.
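
    To illustrate the difference in plain C (outside the compiler), here
    is a minimal sketch.  Everything in it is an assumption made for
    illustration: the typedefs, the helper names, the use of memcpy as
    the bitcast, a 32-bit IEEE 754 float, and a tag layout that shifts
    left by one and sets the low-order bit.  It is not how MLton's
    packed representation is implemented.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef float    Real32;   /* assumed to be a 32-bit IEEE 754 float */
typedef uint32_t Word32;
typedef uint64_t Word64;

/* Numeric coercion: converts the value, so 1.0f becomes 1.
   (Undefined for values outside the range of Word32.) */
static Word32 real32_to_word32_numeric(Real32 r)
{
    return (Word32)r;
}

/* Bitcast: copies the bits unchanged, so 1.0f becomes 0x3F800000. */
static Word32 real32_bitcast(Real32 r)
{
    Word32 w;
    assert(sizeof w == sizeof r);
    memcpy(&w, &r, sizeof w);
    return w;
}

/* Hypothetical shift-and-tag: zero-extend to 64 bits, shift left by
   one, and set the low-order bit to mark a non-pointer variant. */
static Word64 tag_real32(Real32 r)
{
    return ((Word64)real32_bitcast(r) << 1) | (Word64)1;
}

/* Inverse of tag_real32: drop the tag bit and restore the float bits. */
static Real32 untag_real32(Word64 tagged)
{
    Word32 w = (Word32)(tagged >> 1);
    Real32 r;
    assert(tagged & 1);
    memcpy(&r, &w, sizeof r);
    return r;
}
```

    Under these assumptions the numeric cast loses the bit pattern,
    while the memcpy-based bitcast round-trips through the tagged word.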

  * As noted above, array lengths are 64 bits.  Indeed, all the
    primitive sequence operations (Array_array, Array_sub,
    Array_update, Vector_update) use 64-bit integer lengths/indices.

    However, the default ML Int.int type is still Int32.int.  Hence,
    Array.maxLen and array lengths/indices are 32 bits from user code.

    Using '-default-ty int64', one would set the default ML Int.int
    type to Int64.int.  Then one could create and index arrays with
    more than 2^31 elements.  However, I haven't tested this, and I
    think that the runtime aborts if one tries to allocate more than
    0wx7FFFFFFF bytes in one object.
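
    If that suspected limit does hold, a rough calculation suggests it
    would bind before the index width does for any element larger than
    one byte.  The sketch below is only arithmetic on the figure above:
    SUSPECTED_BYTE_CAP and max_elems_under_cap are hypothetical names,
    the cap itself is unverified, and object metadata is ignored.

```c
#include <assert.h>
#include <stdint.h>

/* Suspected (unverified) per-object allocation cap, in bytes. */
#define SUSPECTED_BYTE_CAP ((uint64_t)0x7FFFFFFF)

/* Largest element count whose raw storage fits under the suspected
   cap, ignoring header/counter/length words. */
static uint64_t max_elems_under_cap(uint64_t elem_bytes)
{
    assert(elem_bytes > 0);
    return SUSPECTED_BYTE_CAP / elem_bytes;
}
```

    For 8-byte elements this gives 2^28 - 1 elements, well below the
    2^31 - 1 already reachable with Int32.int indices.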