[MLton] MLton and Moby

Matthew Fluet fluet@cs.cornell.edu
Tue, 3 Feb 2004 22:08:04 -0500 (EST)


For no better reason than to see if it could be done, I "ported" John's
latest Moby release to MLton.  Also, I wanted an application that used
MLRISC to verify that the "port" of MLRISC to MLton was working. I thought
I would pass along a couple of experiences and observations.

In short, it is possible and the resulting MLton mobyc produces assembly
files that are identical to the SML/NJ mobyc assembly files.  MLton mobyc
compiled executables run correctly (at least the very small "Hello World"
example programs I tried).

In the process, I submitted 2 bug reports to SML/NJ (one trivial Basis
bug, one serious type checking bug) and 1 bug report to MLton (an
incorrect elaboration optimization); I also fixed one bug in the MLton
Basis library's IO.

Scattered throughout both the SML/NJ Library and Moby sources were various
SML/NJ-isms:
  or-patterns: fairly common, but one quickly learns to recognize
	MLton's syntax error;  in all cases, I just duplicated the whole
	match (p => e) at each of the patterns (rather than lifting the
	rhs expression to a function); there were only a few cases where I
        think lifting the rhs expression would have been worth it.
  vector constants: not many; very easy to change "#[" to
	"Vector.fromList [", which accomplishes the right thing.
  vector patterns: never encountered.
  withtype in signatures: Somewhat common; probably the most annoying
	deviation to deal with, as it is obvious what its meaning should
	be.  Is there a reason that the Definition disallows it?
  sequential withtype expansions: a few.  All but one of them could be
	rewritten as a sequence of type definitions and datatype
	definitions.  One example of this is the code generated by
	the ASDLgen tool.  Hence, ASDLgen does not generate Standard ML
	as given by the Definition.
  higher-order functors: MLRISC makes a big deal about using them, but
	every higher-order functor definition and application could be
	uncurried in the obvious way.  That is, every use of a
	higher-order functor is fully applied.
  Basis differences: the only one I encountered was the Moby mbi utilities
	assuming Int.int == Position.int; interestingly, I didn't
	encounter any problems with Int.int = Int31.int in SML/NJ and
	Int.int = Int32.int in MLton.
  where <str> = <str>: the most common; very prevalent in the MLRISC
	sources and in the Moby MLRISC interface code.  It can be pretty
	painful to expand out all the flexible types in the respective
	structures.  Furthermore, many of the implied type equalities
	aren't needed, but it's too hard to pick out the right ones.  One
	particularly bothersome idiom in the MLRISC sources was the
	following:
		signature CELLS_BASIS
		structure CellsBasis : CELLS_BASIS
	(that is, a signature and a global structure constrained by that
	signature).  Fairly common are signatures like:
		sig
		  ...
		  structure CB : CELLS_BASIS = CellsBasis
		  ...
		end
	This expands to
		sig
		  ...
		  structure CB : CELLS_BASIS
				 where type cellKindInfo = CellsBasis.cellKindInfo
                                   and ...
		  ...
		end
	for about 8 more constraints.  Since CellsBasis is global,
	it would have been easy enough to code the idiom as follows:
		signature PRE_CELLS_BASIS
		structure CellsBasis : PRE_CELLS_BASIS
		signature CELLS_BASIS =
		  PRE_CELLS_BASIS
		  where type cellKindInfo = CellsBasis.cellKindInfo
		    and ...
	However, I don't know if it's worth refactoring the MLRISC code.
        Many of these "where <str> = <str>" constraints are repeated,
        so I spent a lot of time cut-n-paste-ing.
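
To make a few of the rewrites above concrete, here is a minimal sketch in
Standard ML.  All of the names (color, isWarm, elt, pair, ORD, SHOW, Max,
IntOrd, IntShow) are made up for illustration and do not come from the
Moby or MLRISC sources.  The SML/NJ forms appear only in comments and are
written from memory, so treat them as approximate.

```sml
(* or-patterns.  SML/NJ form (roughly):
     fun isWarm (Red | Green) = true
       | isWarm Blue = false
   Portable form: duplicate the match at each pattern. *)
datatype color = Red | Green | Blue

fun isWarm Red = true
  | isWarm Green = true
  | isWarm Blue = false

(* vector constants.  SML/NJ form:  val v = #[1, 2, 3] *)
val v = Vector.fromList [1, 2, 3]

(* sequential withtype.  SML/NJ form (roughly):
     datatype t = T of pair
     withtype elt = int
          and pair = elt * elt
   Here `pair` mentions `elt`, which relies on the abbreviations being
   expanded in order.  Since neither abbreviation mentions `t`, the
   declaration can be rewritten as a sequence of type definitions
   followed by the datatype definition. *)
type elt = int
type pair = elt * elt
datatype t = T of pair

(* higher-order functors.  SML/NJ curried form (roughly):
     functor Max (O : ORD) (S : SHOW where type t = O.t) = struct ... end
     structure M = Max (IntOrd) (IntShow)
   Because every use is fully applied, the functor can instead take both
   structures at once. *)
signature ORD = sig type t  val le : t * t -> bool end
signature SHOW = sig type t  val show : t -> string end

functor Max (structure O : ORD
             structure S : SHOW where type t = O.t) =
struct
  (* Render the larger of the two arguments as a string. *)
  fun showMax (x, y) = S.show (if O.le (x, y) then y else x)
end

structure IntOrd =
struct
  type t = int
  fun le (x : int, y) = x <= y
end

structure IntShow =
struct
  type t = int
  val show = Int.toString
end

structure M = Max (structure O = IntOrd
                   structure S = IntShow)
```

The withtype rewrite only works when the abbreviations do not refer back
to the datatype being defined; when they do, the abbreviation has to be
expanded by hand at each use inside the datatype.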


While I know MLton won't support "where <str> = <str>" in the future, I
toyed with the idea of hacking something together so that I could get
MLton to print out the complete, elaborated signature for a signature
identifier, so that I could see all the types, rather than hunt through
the sources by hand.  SML/NJ's top-loop probably could have done the same
thing for me.