[MLton] HOL's Moscow ML implementation, and pushing MLton to emulate it

Matthew Fluet fluet@cs.cornell.edu
Thu, 7 Apr 2005 22:45:50 -0400 (EDT)

> > As far as extensibility goes, we would particularly like a tool that
> > enabled us to lift entire ML structures (with their own types and
> > operations) on them into the interpreted domain, but with the
> > operations implemented in the interpreter rather than having the
> > source code interpreted.
> Yeah this sounds great.  When I've thought about this sort of thing in
> the past I've imagined a primitive built into MLton that would expose
> the current environment in scope via an interpreter.  

-show-basis gets you part of the way there.

> One could use
> mlbs to produce the precise environment that one wants.  The biggest
> problem I've never resolved to my satisfaction is how to lift
> polymorphic values in the host environment to the hosted environment.
> My intuition for the problem is that MLton's compilation strategy
> requires seeing all the instances of a polymorphic function at compile
> time.  Once one exposes a polymorphic function to the interpreter,
> it's tough to make that true.

I don't understand your concern.  The hosted environment is an interpreter 
running over its own universal datatype.  When you expose the polymorphic 
function to the interpreter, it is specialized to the universal datatype.  
Even if the interpreter supports a typed-language with polymorphism, the 
terms it interpret must all fall under one universal datatype.

Take HaMLet as an example; essentially, it's universal datatype is 

type BasVal = string
datatype Val = ... | BasVal of BasVal | ... 

defined in DynamicObjectsCore.sml.  One would provide a host function to 
the interpreter by choosing a unique string for it and registering it with 
the library, which provides

val APPLY : BasVal * Val -> Val

which applies the registered function to a value.

> A while ago, I read Nick Benton's paper
> (http://mlton.org/References#Benton05) and really liked it, but if I
> recall correctly, its solution to the problem involved coercions.
> These are slow because they take time proportional to the size to the
> data structure being coerced.  Plus, there are difficulties with
> sharing (I don't remember if the paper adequately addressed those).

Nick's paper suggests using the exceptions embedding/projection trick.  
This is a very nice solution to exporting abstract types to the 
interpreter.  Essentially, extend the above with

type BasVal = string
datatype Val = ... | BasVal of BasVal | AbstrU of U.t | ...

Now a host function simply uses the embed/proj pairs to get values out of 
the AbstrU of U.t variant.  Again, the static type-checking performed by 
the interpreter should ensure that the proj never fails.

If you don't want the type abstract in the interpreter, then I think you 
make the choice about whether or not to add the type to the Val 
representation.  That gives you an easy injection of your type into the 
interpreter's universal type, at the expense of complicating the