[MLton] HOL's Moscow ML implementation, and pushing MLton to emulate it

Michael Norrish Michael Norrish <Michael.Norrish@nicta.com.au>
Thu, 24 Mar 2005 12:33:55 +1100

In an earlier message

I described the basic problem as a combination of three elements: 

A. Implementation of a small ML kernel to prove theorems
B. A persistence mechanism for values of type "thm" (theories)
C. A way to accompany theories with code that is dynamically loaded

-- Moscow ML ----------

Moscow ML gives us two solutions to the above problem, both of which
we use.  

The first is the interactive loop.  Here, we start up the loop and are
able to dynamically load the files and code that are relevant to the
task at hand.  This much is easy. 

The second is the batch-compiler.  The representation of theories is
as SML structures (containing lots of val bindings all of type "thm"),
so that theories (B) and code (C) are represented uniformly.  These
can also be separately compiled, making it possible to load theories
and code independently in the interactive loop above.  Moscow ML loses
the ability to dynamically load code in its compiled executables, but
this doesn't matter, because the compiler/linker is fast enough to be
called to create new executables (linking different theories and
libraries) as required.

-- MLton ----------

MLton doesn't have an interactive loop, but even if we were to
implement an interactive system ourselves (and either invented our own
domain-specific language for doing proof, or provided our own ML
interpreter, as Scott Owens suggested), we still wouldn't be able to
dynamically load code that was appropriate to the theorem-proving task
at hand.  If we switched our theory representation away from SML
structures to data files, we would at least be able to dynamically
load theorems from disk.

Joe Hurd experimented with getting MLton to emulate the batch-compiler
aspect of HOL.  There's lots of repeated work there because the same
files keep getting recompiled to produce slightly different
executables as the build progresses, but it should be theoretically
possible.  Unfortunately, the compiler hates our big theory files,
which are sometimes rather large ('trivial' val-bindings, but lots of
them), and even this much doesn't really work.  If we changed our
file-format on disk, we would solve that problem.

-- Creating a monolith ----------

To create a MLton-based, possibly interactive system, supporting the
various special-purpose bits of code we've written, it seems we will
have to do the following:

 * change to use a data-file format for theorems on disk
 * implement our own proof language 
 * eventually build one monolithic executable that includes all of the
   code that accompanies various theories.  For example, the code for
   deciding linear arithmetic over the real numbers will always be
   present, even if the user is only working with sets of natural
 * provide for relatively easy creation of new versions of the
   monolith, so that users can add their own special-purpose code.
   (This is not likely to happen that often.)

Though it seems a big task, I am sorely tempted to try it.  Thanks for
giving me the opportunity to write out my ideas.  If you have any
comments on the architectures I've described, please get in touch.