[MLton] implement _address and _symbol

Matthew Fluet fluet@cs.cornell.edu
Sun, 17 Jul 2005 18:42:19 -0400 (EDT)

> > MLton already has the Pointer_{get,set} primitives, so the only
> > additional thing needed is a primitive to support _address.
> I'm not too clear what exactly a primitive is.

As you note below, primitives are essentially those operations that have 
no simpler representation in any current intermediate language.  The 
majority of the primitives are preserved through the entire compilation 
process and handled by the codegens, but there are a handful that are 
eliminated earlier in the process.  The reference source for primitives 
is  mlton/atoms/prim.sig.

> Shouldn't there already be one if _import # exists?

Sort of.  The  _import #  expression gets translated into the FFI_Symbol 
primitive, whose semantics is "the address denoted by the symbol".

> So, let me see if my very general understanding of MLton is right:
> 	it parses the source into an AST

Right, this corresponds to  mlton/front-end/*  and  mlton/ast/*  files.

> 	it does some kind of scoped binding of names to values
> 	it type checks and associates types with the values

Correct, this corresponds to  mlton/elaborate/*  files.
This process simultaneously rewrites the program into the CoreML 
intermediate language (see the  mlton/core-ml/*  files).

> 	it rewrite the whole thing into successively simpler languages
> 	finally, there is a language so simple that it is only calls to
> 		'primitives'
> 	this gets run through the codegen
> If the above is right, then what I need to do is add _address and _symbol as
> keywords to the parser, teach the type checker how to resolve the type of
> these things using the '_symbol "s": type;' syntax, during one of those
> intermediate language steps convert it to a pair of functions that call
> Pointer_{get,set}.

Exactly.  As noted above, the elaborator does both the type-checking and 
a translation step, so you both "teach the type checker" and "convert it 
to a pair" at the same time.
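
As a rough illustration of the shape of that expansion, here is a toy model 
in plain SML.  It is not MLton's actual elaborator output: the Pointer 
structure and the address function below are hypothetical stand-ins for 
the Pointer_{get,set} and FFI_Symbol primitives, "memory" is modeled as an 
array, and the symbol name "counter" is made up.  It also assumes that 
_symbol  yields a getter/setter pair, as proposed in this thread.

```sml
(* Toy stand-in for the Pointer_{get,set} primitives (assumption:
   "memory" is just an int array and an address is an index into it). *)
structure Pointer =
struct
   type t = int
   val mem = Array.array (16, 0)
   fun getInt (p: t) : int = Array.sub (mem, p)
   fun setInt (p: t, v: int) : unit = Array.update (mem, p, v)
end

(* Toy stand-in for FFI_Symbol / _address: map a symbol name to its
   address.  The name "counter" and the address 3 are invented. *)
fun address (name: string) : Pointer.t =
   case name of
      "counter" => 3
    | _ => raise Fail ("unknown symbol: " ^ name)

(* Roughly what  _symbol "counter": int;  could expand to: the address
   is resolved once, and a (getter, setter) pair closes over it. *)
val (getCounter, setCounter) : (unit -> int) * (int -> unit) =
   let
      val p = address "counter"
   in
      (fn () => Pointer.getInt p,
       fn v => Pointer.setInt (p, v))
   end

val () = setCounter 42
val () = print (Int.toString (getCounter ()) ^ "\n")
```

The point of the sketch is only that the pair can be written entirely in 
terms of an address plus get/set operations, which is why the elaborator 
can build it without a new primitive.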

I'm fairly confident that one could accomplish this task without needing 
to touch more than:
 mlton/front-end/ml.lex : add _symbol and _address as keywords
 mlton/front-end/ml.grm : add productions for _symbol and _address
 mlton/ast/ast-core.{sig,fun} : add PrimKind.Address for an AST node 
    corresponding to _address; you'll find that there is already 
    PrimKind.Symbol, which has been serving as the AST node corresponding 
    to _import #.
 mlton/elaborate/elaborate-core.fun : this is where the heavy lifting 
    happens; all of the current FFI primitives are grouped together

> For _address, I probably just need to move whatever _import # did to it.

In fact, _import # was singled out so early in the process that you 
would simply need to adjust the lexer/parser and rename PrimKind.Symbol 
to PrimKind.Address.

> > That and a little bit of front end hacking to expand the syntax into the
> > appropriate lambda expressions should do it.
> Is the front-end one of the language simplification steps?
> (Perhaps the first one?)

"Front end" is the general term for the portion of the compiler that
includes the lexer/parser and the elaborator.  (With -verbose 2, it shows
up as the parseAndElaborate pass.)
So, Stephen's comment points to modifying elaborate-core.fun to convert
the AST nodes for (well-typed) FFI primitives into an expanded form in the
CoreML IL (i.e., "lambda expressions").  As I noted above, this shouldn't
require any new primitives; the necessary CoreML expression can be built 
up from the existing primitives.