[MLton] MLton and Moby
Matthew Fluet
fluet@cs.cornell.edu
Tue, 3 Feb 2004 22:08:04 -0500 (EST)
For no better reason than to see if it could be done, I "ported" John's
latest Moby release to MLton. Also, I wanted an application that used
MLRISC to verify that the "port" of MLRISC to MLton was working. I thought
I would pass along a couple of experiences and observations.
In short, it is possible and the resulting MLton mobyc produces assembly
files that are identical to the SML/NJ mobyc assembly files. MLton mobyc
compiled executables run correctly (at least the very small "Hello World"
example programs I tried).
In the process, I submitted 2 bug reports to SML/NJ (one trivial Basis
bug, one serious type checking bug) and 1 bug report to MLton (an
incorrect elaboration optimization); I also fixed one bug in the MLton
Basis library's IO.
Scattered throughout both the SML/NJ Library and Moby sources were various
SML/NJ-isms:
or-patterns: fairly common, but one quickly learns to recognize
MLton's syntax error; in all cases, I just duplicated the whole
match (p => e) at each of the patterns (rather than lifting the
rhs expression to a function); there were only a few cases where I
think lifting the rhs expression would have been worth it.
vector constants: not many; very easy to change "#[" to
"Vector.fromList [", which accomplishes the right thing.
vector patterns: never encountered.
withtype in signatures: somewhat common; probably the most annoying
deviation to deal with, as it is obvious what its meaning should
be. Is there a reason that the Definition disallows it?
sequential withtype expansions: a few. All but one of them could be
rewritten as a sequence of type definitions and datatype
definitions. One example is the code generated by the ASDLgen
tool; hence, ASDLgen does not generate SML as the Definition specifies it.
higher-order functors: MLRISC makes a big deal about using them, but
every higher-order functor definition and application could be
uncurried in the obvious way. That is, every use of a
higher-order functor is fully applied.
Basis differences: the only one I encountered was the Moby mbi utilities
assuming Int.int == Position.int; interestingly, I didn't
encounter any problems with Int.int = Int31.int in SML/NJ and
Int.int = Int32.int in MLton.
where <str> = <str>: the most common; very prevalent in the MLRISC
sources and in the Moby MLRISC interface code. It can be pretty
painful to expand out all the flexible types in the respective
structures. Furthermore, many of the implied type equalities
aren't needed, but it's too hard to pick out the right ones. One
particularly bothersome idiom in the MLRISC sources was the
following:
    signature CELLS_BASIS
    structure CellsBasis : CELLS_BASIS
(that is, a signature and a global structure constrained by that
signature). Fairly common are signatures like:
    sig
      ...
      structure CB : CELLS_BASIS = CellsBasis
      ...
    end
This expands to
    sig
      ...
      structure CB : CELLS_BASIS
        where type cellKindInfo = CellsBasis.cellKindInfo
          and ...
      ...
    end
with about 8 more constraints. Since CellsBasis is global,
it would have been easy enough to code the idiom as follows:
    signature PRE_CELLS_BASIS
    structure CellsBasis : PRE_CELLS_BASIS
    signature CELLS_BASIS =
      PRE_CELLS_BASIS
        where type cellKindInfo = CellsBasis.cellKindInfo
          and ...
However, I don't know if it's worth refactoring the MLRISC code.
Many of these "where <str> = <str>" constraints are repeated,
so I spent a lot of time cut-n-paste-ing.
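To make the simpler rewrites from the list above concrete (or-patterns,
vector constants, the two withtype cases, and uncurried functors), here is a
minimal sketch in Standard ML '97. Every name in it (color, isWarm, shape,
TREE, ORD, PairOrd, ...) is hypothetical and chosen only for illustration;
none of it is taken from the Moby, MLRISC, or SML/NJ Library sources. The
SML/NJ forms being replaced appear only in comments.

```sml
datatype color = Red | Orange | Blue

(* SML/NJ or-pattern:  fun isWarm (Red | Orange) = true | isWarm Blue = false
   Portable version: duplicate the whole match at each pattern. *)
fun isWarm Red = true
  | isWarm Orange = true
  | isWarm Blue = false

(* SML/NJ vector constant:  val primes = #[2, 3, 5, 7] *)
val primes = Vector.fromList [2, 3, 5, 7]

(* SML/NJ expands withtype abbreviations sequentially, so it accepts
     datatype shape = Dot of point | Line of point * point
     withtype coord = int and point = coord * coord
   Here the abbreviations do not mention the datatype, so they can simply
   become ordinary type definitions placed before it. *)
type coord = int
type point = coord * coord
datatype shape = Dot of point | Line of point * point

(* SML/NJ also accepts withtype inside a signature:
     datatype tree = Node of forest
     withtype forest = tree list
   Portable version: substitute the abbreviation into the datatype
   specification by hand, then specify the abbreviation afterwards. *)
signature TREE =
sig
  datatype tree = Node of tree list
  type forest = tree list
  val size : tree -> int
end

structure Tree :> TREE =
struct
  (* withtype is standard in declarations, just not in specifications *)
  datatype tree = Node of forest
  withtype forest = tree list
  fun size (Node ts) = 1 + List.foldl (fn (t, n) => size t + n) 0 ts
end

signature ORD =
sig
  type t
  val le : t * t -> bool
end

(* SML/NJ curried functor:  functor PairOrd (A : ORD) (B : ORD) = ...
   applied as               PairOrd (IntOrd) (IntOrd)
   Portable version: one argument that is a list of structure specs. *)
functor PairOrd (structure A : ORD
                 structure B : ORD) =
struct
  type t = A.t * B.t
  (* lexicographic order on pairs *)
  fun le ((a1, b1), (a2, b2)) =
    A.le (a1, a2) andalso (not (A.le (a2, a1)) orelse B.le (b1, b2))
end

structure IntOrd =
struct
  type t = int
  fun le (x : int, y) = x <= y
end

structure IntPair = PairOrd (structure A = IntOrd
                             structure B = IntOrd)
```

The uncurried form only works because every use of the functor is fully
applied, as noted above; a partial application would have no counterpart.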
While I know MLton won't support "where <str> = <str>" in the future, I
toyed with the idea of hacking something together so that I could get
MLton to print out the complete, elaborated signature for a signature
identifier, so that I could see all the types, rather than hunt through
the sources by hand. SML/NJ's top-loop probably could have done the same
thing for me.