[MLton] x86_64 port status

Thu, 22 Dec 2005 10:51:02 -0500 (EST)

Status of the x86_64 port of MLton.
=======================================================================

Sources:

Work is progressing on the x86_64 branch; interested parties may check
out the latest revision with:

svn co svn://mlton.org/mlton/branches/on-20050822-x86_64-branch mlton.x86_64

and view the sources on the web at:

http://mlton.org/cgi-bin/viewsvn.cgi/mlton/branches/on-20050822-x86_64-branch/

Background Reading:

(* Representing 64-bit pointers. *)
http://mlton.org/pipermail/mlton/2004-October/026162.html
(* MLton GC overview *)
http://mlton.org/pipermail/mlton/2005-July/027585.html

Executive Summary:

The runtime system (i.e., garbage collector and related services) has
been rewritten to be configurable along two independent axes: the
native pointer size and the ML heap object pointer size.  There are no
known functionality or performance regressions in the rewritten
runtime relative to the mainline runtime.

The next step will be to modify the Basis Library implementation (on
both the SML and C sides) to be agnostic to the native representation
of primitive C-types (e.g., int, long); this is important for getting
the right representation for file descriptors, etc.  This step ensures
that the Basis Library implementation may be shared between 32-bit and
64-bit systems.

Following that, it should be possible to push changes through the
compiler proper to support a C-codegen in which all pointers are
64-bit.  After shaking out bugs there, we should be able to consider
supporting smaller ML-pointer representations and a simple native
codegen.

Technical Details:

Thus far, code modifications have been limited to the runtime/
directory:

http://mlton.org/cgi-bin/viewsvn.cgi/mlton/branches/on-20050822-x86_64-branch/runtime/

The new gc/ sub-directory breaks down the GC implementation into
smaller pieces.  For efficiency, they are #include-ed together to form
a single compilation unit to feed to the C compiler.

A key design decision has been to implement the GC in a manner that is
agnostic to the native pointer size and to the ML heap object pointer
representation.  The file model.h encapsulates the key attributes that
describe an ML heap object pointer representation, and the files
objptr.{h,c} encapsulate the conversions between native pointers and
ML heap object pointers.  In most places, such conversions are
relatively routine.  One major exception is that some care must be
taken with threading of internal pointers for Jonkers's mark-compact
GC, since it must compensate for the possibility that an ML-pointer
is not the same size as an ML-header (see the file mark-compact.c).
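
To make the conversion idea concrete, here is a rough sketch of what
a native-pointer <-> ML-heap-object-pointer conversion could look like
when ML pointers are narrower than native pointers.  Everything here
is an assumption for illustration: the names (objptr_t, OBJPTR_SHIFT,
heap_base, and both functions) are hypothetical, and the
"offset from heap base, shifted by the alignment" scheme is just one
possible model, not necessarily the one in model.h or objptr.{h,c}.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model parameters (the real ones would come from model.h). */
typedef uint32_t objptr_t;   /* ML heap object pointer: 32 bits in this sketch */
#define OBJPTR_SHIFT 2       /* assumption: heap objects are 4-byte aligned */

/* ML heap object pointer -> native pointer, relative to the heap base. */
static inline unsigned char *objptr_to_pointer(objptr_t op,
                                               unsigned char *heap_base) {
    return heap_base + ((uintptr_t)op << OBJPTR_SHIFT);
}

/* Native pointer -> ML heap object pointer.  The pointer must lie in the
 * heap, be suitably aligned, and be representable in an objptr_t. */
static inline objptr_t pointer_to_objptr(unsigned char *p,
                                         unsigned char *heap_base) {
    uintptr_t offset = (uintptr_t)(p - heap_base);
    assert((offset & (((uintptr_t)1 << OBJPTR_SHIFT) - 1)) == 0);
    assert((offset >> OBJPTR_SHIFT) <= UINT32_MAX);
    return (objptr_t)(offset >> OBJPTR_SHIFT);
}
```

With a model like this, only these two functions need to know the
representation; the rest of the GC can be written against native
pointers and objptr_t values.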

Similarly, any assumptions about the native WORD_SIZE have been
removed.  All object sizes are measured in 8-bit bytes and stored in
size_t variables.  Statistics are gathered in uintmax_t and intmax_t
variables.

The C-side of the Basis Library implementation is entirely agnostic to
the representation of ML objects (pointers, headers, etc.).  That is,
the FFI assumes that all ML heap object pointers are passed by their
native pointer representation.  Consequently, all functions exported
by the GC to the Basis Library are expressed in terms of native
pointers.

As stated above, the next step will be to modify the Basis Library
implementation (on both the SML and C sides) to be agnostic to the
native representation of primitive C-types (e.g., int, long).  I
believe it will be worthwhile to follow the technique used in the
MLNLFFI-library implementation.  There, we use two ML Basis path
variables (TARGET_ARCH, TARGET_OS) to choose the correct ML
representation for primitive C types.  To put it another way, we can
view the Basis Library implementation as a functor parameterized by
the sizes of (and primitives supporting) the C types.  Since it is too
hard to actually use a functor, we use an MLB-style functor instead.

The IntInf.int implementation (using the GNU MP library) will also
require some revision.

Currently, runtime/basis/IntInf.c requires deeper knowledge of the
representation of ML objects and the garbage collector state than any
other C-side Basis Library implementation file.  For example, IntInf.c
gets to directly set the heap frontier in garbage collector state,
without going through a function exported from gc.o.  Likewise, it
gets to directly access the length of the array (i.e., knowing the
object layout).  No other portion of the Basis Library implementation
requires this info.  There is probably a big performance cost to
abstracting this stuff away, so the plan is to either directly fold
IntInf.c into the garbage collector implementation or to at least
compile an instance of IntInf.c for each instance of gc.o.

[[

The above issue arises out of the open question of how best to package
the runtime for supporting multiple ML object representations.  The
choice of ML object representation will be a compile time decision
made by the user, but will also require linking to a runtime that
understands the representation.  Currently, since there is exactly one
representation, we deliver a single libmlton.a library that includes
the garbage collector (and related services) and also the C-side of
the Basis Library implementation.

There are two obvious choices for supporting multiple ML object
representations.
  1) Deliver multiple  libmlton.repN.a  libraries, again combining the
  garbage collector and the Basis Library.
  2) Deliver a single  libmlton-basis.a  library, with a
  representation-agnostic implementation of the C-side of the Basis
  Library, and multiple  libmlton-gc.repN.a  libraries.

The second option is preferred, as it preserves the abstraction
between ML and C.  The only exception is that IntInf.c would need to
be moved over to the libmlton-gc.repN.a libraries, which seems
acceptable, as its performance requires it to be specialized along
with the rest of the GC.

]]

The other IntInf issue is to robustly support an efficient
representation.  Currently, IntInf.int is represented like:

  datatype t = Small of Int31.int
             | Big of Word32.word vector

The bottom bit suffices to distinguish a 31-bit signed integer from
the (ML heap object) pointer to the vector, so we achieve a very
compact representation.
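
For readers unfamiliar with this kind of tagging, the following is a
minimal sketch of how a single 32-bit word can hold either variant.
It assumes that a set low bit marks the Small case and that pointers
to vectors are at least 2-byte aligned (so their low bit is 0); the
type and function names are hypothetical and are not taken from
IntInf.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t intinf_word;  /* one word: tagged small int or pointer */

/* Assumption: low bit 1 means Small, low bit 0 means pointer to a vector. */
static inline bool intinf_is_small(intinf_word w) {
    return (w & 1u) != 0;
}

/* Pack a 31-bit signed integer into the upper 31 bits and set the tag. */
static inline intinf_word intinf_from_small(int32_t n) {
    assert(n >= -(INT32_C(1) << 30) && n < (INT32_C(1) << 30));
    return ((uint32_t)n << 1) | 1u;
}

/* Recover the 31-bit signed integer from a tagged word. */
static inline int32_t intinf_to_small(intinf_word w) {
    assert(intinf_is_small(w));
    uint32_t payload = w >> 1;  /* 31-bit two's complement payload */
    /* Sign-extend from bit 30 without relying on signed right shift. */
    return (int32_t)(payload & 0x3FFFFFFFu)
         - (int32_t)(payload & 0x40000000u);
}
```

For example, under these assumptions the integer 5 is stored as the
word 11 (binary 1011), while an aligned pointer is stored unchanged.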

Going to a 64-bit system, both the Small and Big representations may
change.  To maintain the efficient representation, the Small
representation should correspond to the number of bits used to
represent an ML heap object pointer, which could be 32 or 64.
Orthogonally, on a 64-bit system, the GNU MP library uses 64-bit words
to represent a bignum, so the Big representation would also change.

One way to accommodate the GNU MP library would be to change the Big
representation to
   Big of Word8.word vector
and require that the implementation maintain the length of the vector
at a multiple of the limb size (plus the sign bit/word), which would
be exposed as a compile time constant.
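
As a sketch of the length invariant this would impose, the C side
could check and compute vector lengths roughly as follows.  This is
illustrative only: LIMB_BYTES stands in for the compile time constant
mentioned above (hard-coded to 8 here), the sign is assumed to occupy
one limb-sized word, and all names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical compile time constants for this sketch. */
#define LIMB_BYTES ((size_t)8)  /* assumed limb size on a 64-bit system */
#define SIGN_BYTES LIMB_BYTES   /* assumption: sign kept in one limb-sized word */

/* Length in bytes of the Word8.word vector holding num_limbs limbs. */
static inline size_t big_byte_length(size_t num_limbs) {
    return SIGN_BYTES + num_limbs * LIMB_BYTES;
}

/* A valid Big vector holds the sign word plus a whole number (>= 1) of limbs. */
static inline bool big_length_is_valid(size_t byte_len) {
    return byte_len >= SIGN_BYTES + LIMB_BYTES
        && (byte_len - SIGN_BYTES) % LIMB_BYTES == 0;
}

/* Number of limbs stored in a valid Big vector of byte_len bytes. */
static inline size_t big_num_limbs(size_t byte_len) {
    assert(big_length_is_valid(byte_len));
    return (byte_len - SIGN_BYTES) / LIMB_BYTES;
}
```

The same source would then serve a 32-bit build by changing only the
constant, which is the point of exposing the limb size at compile
time.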