[MLton] implement _address and _symbol
Matthew Fluet
fluet@cs.cornell.edu
Sun, 17 Jul 2005 18:42:19 -0400 (EDT)
> > MLton already has the Pointer_{get,set} primitives, so the only
> > additional thing needed is a primitive to support _address.
>
> I'm not too clear what exactly a primitive is.
As you note below, a primitive is essentially an operation that has no simpler
representation in any of the intermediate languages. The majority of the
primitives are preserved through the entire compilation process and
handled by the codegens, but there are a handful that are eliminated
earlier in the process.
The reference source for primitives is mlton/atoms/prim.sig.
> Shouldn't there already be one if _import # exists?
Sort of. The _import # expression gets translated into the FFI_Symbol
primitive, whose semantics is "the address denoted by the symbol".
> So, let me see if my very general understanding of MLton is right:
> it parses the source into an AST
Right, this corresponds to mlton/front-end/* and mlton/ast/* files.
> it does some kind of scoped binding of names to values
> it type checks and associates types with the values
Correct, this corresponds to mlton/elaborate/* files.
This process simultaneously rewrites the program into the CoreML
intermediate language: mlton/core-ml/* files.
> it rewrite the whole thing into successively simpler languages
> finally, there is a language so simple that it is only calls to
> 'primitives'
> this gets run through the codegen
>
> If the above is right, then what I need to do is add _address and _symbol as
> keywords to the parser, teach the type checker how to resolve the type of
> these things using the '_symbol "s": type;' syntax, during one of those
> intermediate language steps convert it to a pair of functions that call
> Pointer_{get,set}.
Exactly. As noted above, the elaborator does both the type-checking and
a translation step, so you both "teach the type checker" and "convert it
to a pair" at the same time.
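
To make the "pair of functions" idea concrete, here is a rough model in
Python rather than SML. It is only a sketch: MEMORY, SYMBOLS, and every
function name below are made up for illustration and are not MLton's actual
primitives or runtime. The point is just that a getter and a setter can both
be built from an address-of-symbol operation plus a pointer get and a pointer
set.

```python
import struct

# Toy stand-in for C memory: 16 bytes of raw storage (an assumption).
MEMORY = bytearray(16)

# Hypothetical symbol table mapping a C symbol name to its address.
SYMBOLS = {"counter": 8}


def ffi_symbol(name):
    """Model of 'the address denoted by the symbol'."""
    return SYMBOLS[name]


def pointer_get_int32(addr):
    """Model of a pointer-get primitive: read a 32-bit int at addr."""
    return struct.unpack_from("<i", MEMORY, addr)[0]


def pointer_set_int32(addr, value):
    """Model of a pointer-set primitive: write a 32-bit int at addr."""
    struct.pack_into("<i", MEMORY, addr, value)


def symbol_int32(name):
    """Expand a symbol declaration into a (getter, setter) pair."""
    def getter():
        return pointer_get_int32(ffi_symbol(name))

    def setter(value):
        pointer_set_int32(ffi_symbol(name), value)

    return getter, setter


get_counter, set_counter = symbol_int32("counter")
set_counter(42)
print(get_counter())  # prints 42
```

Nothing beyond the three modelled primitives is needed to build the pair,
which is the reason no new primitive should be required.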
I'm fairly confident that one could accomplish this task without needing
to touch more than:
  mlton/front-end/ml.lex : add _symbol and _address as keywords
  mlton/front-end/ml.grm : add productions for _symbol and _address
  mlton/ast/ast-core.{sig,fun} : add PrimKind.Address for an AST node
     corresponding to _address; you'll find that there is already
     PrimKind.Symbol, which has been serving as the AST node corresponding
     to _import #.
  mlton/elaborate/elaborate-core.fun : this is where the heavy lifting
     happens; all of the current FFI primitives are grouped together
> For _address, I probably just need to move whatever _import # did to it.
In fact, _import # was singled out so early in the process that you
would simply need to adjust the lexer/parser and rename PrimKind.Symbol
to PrimKind.Address.
> > That and a little bit of front end hacking to expand the syntax into the
> > appropriate lambda expressions should do it.
>
> Is the front-end one of the language simplification steps?
> (Perhaps the first one?)
"Front end" is the general term for the portion of the compiler that includes
the lexer/parser and elaborator. (With -verbose 2, it is the parseAndElaborate
pass.)
So, Stephen's comment points to modifying elaborate-core.fun to convert
the AST nodes for (well-typed) FFI primitives into an expanded form in the
CoreML IL (i.e., "lambda expressions"). As I noted above, this shouldn't
require any new primitives; the necessary CoreML expression can be built
up from the existing primitives.
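
As a loose illustration of that expansion step, here is a toy elaborator in
Python. Everything in it is hypothetical: the node classes, the
elaborate_symbol function, and the Pointer_get_<ty> / Pointer_set_<ty> naming
are invented for this sketch and do not reflect the real CoreML datatypes or
the exact primitive names and arguments in MLton. It only shows the shape of
the idea: one AST node goes in, and two lambda expressions built from
existing primitives come out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prim:
    """Application of a primitive to arguments (toy IL node)."""
    name: str
    args: tuple = ()


@dataclass(frozen=True)
class Var:
    """Variable reference (toy IL node)."""
    name: str


@dataclass(frozen=True)
class Lam:
    """Lambda expression with one parameter (toy IL node)."""
    param: str
    body: object


@dataclass(frozen=True)
class SymbolNode:
    """Toy AST node for a symbol declaration: C name plus element type."""
    c_name: str
    ty: str


def elaborate_symbol(node):
    """Translate a SymbolNode into a (getter, setter) pair of lambdas."""
    # The address comes from the existing address-of-symbol primitive.
    addr = Prim("FFI_Symbol", (node.c_name,))
    # Getter: takes a dummy unit argument and reads through the address.
    getter = Lam("_unit", Prim("Pointer_get_" + node.ty, (addr,)))
    # Setter: takes the new value and writes it through the address.
    setter = Lam("x", Prim("Pointer_set_" + node.ty, (addr, Var("x"))))
    return getter, setter


getter, setter = elaborate_symbol(SymbolNode("counter", "int32"))
print(getter)
print(setter)
```

In the real compiler the type checking would happen in the same place as this
translation, since the elaborator does both at once.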