[MLton] mlb files and the ML Kit

Martin Elsman mael@itu.dk
Mon, 15 Mar 2004 17:34:43 +0100


Stephen Weeks <sweeks@sweeks.com> writes:

> The only way I see to achieve equality between the expressiveness of
> mlb files and basis expressions is to allow arbitrarily deep bases as
> their denotation.
>
> 	Basis2 = (Bid -> Basis2) x Basis0
>
> This is completely analogous with the denotation of structures in SML	
>
> 	Env = (Strid -> Env) x TyEnv x ValEnv
>
> Strictly keeping the analogy with SML structures, we have the
> following language.
>
> <bdec> ::= basis <bid> = <bexp>
>          | local <bdec> in <bdec> end
>          | open <bid>*
>          | <bdec> <bdec>
>          | <empty>
> <bexp> ::= <bid>
>          | bas <bdec> end
>          | let <bdec> in <bexp> end

Good points! I like your proposal. The analogy with the SML syntax for
structures is also nice. Keeping the analogy is probably better than
supporting more compact basis files.

> Now, where to add .mlb and .sml files to the grammar?  Since an mlb
> file denotes an element of Basis2, we might expect that it would
> contain a single <bexp> and it should be allowed to refer to an mlb
> file in the grammar for <bexp>'s.  However, for syntactic convenience,
> I think it is easier to go the other way.  I would like the contents
> of an mlb file to be a <bdec> and to refer to mlb files directly as a
> <bdec>.
>
> <bdec> ::= ... | file.mlb
>
> mlb files still denote elements of basis2.  This syntactic convenience
> is simply the implicit wrapping of a "bas ... end" around the contents
> of an mlb file and the implicit prefixing of an "open" wherever an mlb
> file is referenced.  
>
> For similar reasons, I would like to include sml files as <bdec>'s.
>
> <bdec> ::= ... | file.sml
>
> The net result is that the common case of a list of sml and mlb files
> is itself valid contents of an mlb file, requiring no additional
> syntax.

This seems like the right solution. It is also worth noticing that the
analogy with structures would be weakened if a file (.mlb or .sml)
fell into the <bexp> class, because we would need to extend the <bexp>
class with <bexp> ::= <bexp> <bexp> to allow for lists of
sml-files. Another problem with my own proposal was that its open
construct did not have the "parallel" semantics that the open
construct for core declarations has.
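To make the shape of the grammar concrete, here is a rough sketch of
it as Python dataclasses, with files sitting in the <bdec> class as
proposed. The class names (BasId, Bas, Let, File, BasisBind, Local,
Open, Seq) and the show function are made up for illustration and do
not come from MLton or the ML Kit; Seq generalises "<bdec> <bdec>" to
n items, with the empty tuple standing for <empty>.

```python
from dataclasses import dataclass
from typing import Tuple, Union

@dataclass(frozen=True)
class BasId:                 # <bexp> ::= <bid>
    name: str

@dataclass(frozen=True)
class Bas:                   # <bexp> ::= bas <bdec> end
    dec: "Bdec"

@dataclass(frozen=True)
class Let:                   # <bexp> ::= let <bdec> in <bexp> end
    dec: "Bdec"
    body: "Bexp"

@dataclass(frozen=True)
class File:                  # <bdec> ::= file.sml | file.mlb
    path: str

@dataclass(frozen=True)
class BasisBind:             # <bdec> ::= basis <bid> = <bexp>
    name: str
    exp: "Bexp"

@dataclass(frozen=True)
class Local:                 # <bdec> ::= local <bdec> in <bdec> end
    hidden: "Bdec"
    body: "Bdec"

@dataclass(frozen=True)
class Open:                  # <bdec> ::= open <bid>*
    names: Tuple[str, ...]

@dataclass(frozen=True)
class Seq:                   # <bdec> ::= <bdec> <bdec> | <empty>
    decs: Tuple["Bdec", ...]

Bexp = Union[BasId, Bas, Let]
Bdec = Union[File, BasisBind, Local, Open, Seq]

def show(node) -> str:
    """Print a node back in the concrete syntax of the grammar."""
    if isinstance(node, BasId):
        return node.name
    if isinstance(node, Bas):
        return f"bas {show(node.dec)} end"
    if isinstance(node, Let):
        return f"let {show(node.dec)} in {show(node.body)} end"
    if isinstance(node, File):
        return node.path
    if isinstance(node, BasisBind):
        return f"basis {node.name} = {show(node.exp)}"
    if isinstance(node, Local):
        return f"local {show(node.hidden)} in {show(node.body)} end"
    if isinstance(node, Open):
        return "open " + " ".join(node.names)
    if isinstance(node, Seq):
        return " ".join(show(d) for d in node.decs)
    raise TypeError(f"not a bdec or bexp: {node!r}")

# A plain list of files is already a <bdec>; no extra <bexp> form is needed.
my_mlb = Seq((File("A.sml"), File("B.sml"), File("C.sml")))
# A file only becomes a <bexp> once it is wrapped in bas ... end.
basis_b = BasisBind("B", Let(Open(("A",)), Bas(File("B.sml"))))
```

Had File been a <bexp> instead, the Seq node would have needed a
counterpart on the <bexp> side, which is the extra production
mentioned above.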

> With the above syntax, Martin's first two examples are written as
>
>    my.mlb
>      open A.sml B.sml C.sml
>    becomes
>      A.sml B.sml C.sml
>
>    my2.mlb
>      bas A = A.sml
>      bas B = let open A in B.sml end
>      bas C = let open A in C.sml end
>      open A B C
>    becomes
>      basis A = bas A.sml end
>      basis B = let open A in bas B.sml end end
>      basis C = let open A in bas C.sml end end
>      open A B C
>
> Here's a static semantics for the language, including the handling of
> caching of mlb files and elaboration of sml files.
>   
>     b in Bid
>     d in Bdec
>     e in Bexp
>     C in (File.sml -> Topdec) x (File.mlb -> Bdec)
>     B in Basis = (Bid -> Basis) x FunEnv x SigEnv x Env
>     F in (File.mlb -> Basis + {NONE})
>
>     Judgement: B, C, F |- e --> B', F'
>
>     ------------------------
>     B, C, F |- b --> B(b), F
>
>     B, C, F |- d --> B', F'
>     -------------------------------
>     B, C, F |- bas d end --> B', F'
>
>     B, C, F |- d --> B', F'  B + B', C, F' |- e --> B'', F''
>     --------------------------------------------------------
>     B, C, F |- let d in e end --> B'', F''
>
>
>     Judgement: B, C, F |- d --> B', F'
>
>     B |- C(file.sml) => B'
>     -----------------------------
>     B, C, F |- file.sml --> B', F
>
>     F(file.mlb) = B' in Basis
>     -----------------------------
>     B, C, F |- file.mlb --> B', F
>
>     F(file.mlb) = NONE    {}, C, F |- C(file.mlb) --> B', F'
>     --------------------------------------------------------
>     B, C, F |- file.mlb --> B', F'[file.mlb -> B']
>
>     B, C, F |- e --> B', F'
>     -----------------------------------------
>     B, C, F |- basis b = e --> [b |-> B'], F'
>
>     B, C, F |- d1 --> B', F'  B + B', C, F' |- d2 --> B'', F''
>     ----------------------------------------------------------
>     B, C, F |- local d1 in d2 end --> B'', F''
>
>     ----------------------------------------------------
>     B, C, F |- open b1 ... bn --> B(b1) + ... + B(bn), F
>
>     B, C, F |- d1 --> B', F'  B + B', C, F' |- d2 --> B'', F''
>     ----------------------------------------------------------
>     B, C, F |- d1 d2 --> B' + B'', F''
>
>     ----------------------
>     B, C, F |-   --> {}, F
>
>
>> BTW: should it be allowed to specify a source file twice in an
>> mlb-file? Or in an entire project?
>
> It seems OK to me.  The meaning I proposed for that a while back was
> to duplicate (and re-elaborate) the file contents at each reference.
> I've kept that meaning in the semantics above.
>
>> Also, would it be bad to require mlb-file names to be unique
>
> Yes, that would be bad :-).
>
>> - or should we require only absolute paths to mlb-files to be
>> unique.
>
> I'm not sure what this means.  We will certainly traverse symbolic
> links.  How could absolute paths not be unique?

We could perhaps issue a warning if a file is mentioned multiple
times. In most cases, I believe it would be a mistake, but perhaps the
programmer would use the feature to write poor-man's functors.
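The quoted rules and the warning idea can be tried out with a small
Python sketch. Everything here is a simplification made for
illustration: syntax is encoded as tagged tuples, FunEnv x SigEnv x Env
is collapsed into a single dict, and "elaborating" an sml file merely
tags each declared name with the file and an instance number (the
incoming basis is ignored). The names Basis, Elaborator, sml_runs and
duplicates are hypothetical. Cyclic mlb references are not detected.

```python
from dataclasses import dataclass, field
from typing import Dict

@dataclass
class Basis:
    bases: Dict[str, "Basis"] = field(default_factory=dict)  # Bid -> Basis
    env: Dict[str, str] = field(default_factory=dict)        # stand-in for FunEnv x SigEnv x Env

    def plus(self, other: "Basis") -> "Basis":
        # B + B': bindings in the right operand shadow those in the left.
        return Basis({**self.bases, **other.bases}, {**self.env, **other.env})

class Elaborator:
    def __init__(self, sml, mlb):
        self.sml = sml        # C, first component: file.sml -> declared names
        self.mlb = mlb        # C, second component: file.mlb -> bdec
        self.cache = {}       # F: file.mlb -> Basis; a missing key plays the role of NONE
        self.sml_runs = {}    # number of times each sml file has been elaborated

    def exp(self, B, e):
        tag = e[0]
        if tag == "id":                      # b --> B(b)
            return B.bases[e[1]]
        if tag == "bas":                     # bas d end
            return self.dec(B, e[1])
        if tag == "let":                     # let d in e end
            B1 = self.dec(B, e[1])
            return self.exp(B.plus(B1), e[2])
        raise ValueError(tag)

    def dec(self, B, d):
        tag = d[0]
        if tag == "sml":                     # re-elaborated at every reference
            n = self.sml_runs.get(d[1], 0) + 1
            self.sml_runs[d[1]] = n
            return Basis(env={x: f"{d[1]}#{n}" for x in self.sml[d[1]]})
        if tag == "mlb":                     # elaborated once in {}, then cached
            if d[1] not in self.cache:
                self.cache[d[1]] = self.dec(Basis(), self.mlb[d[1]])
            return self.cache[d[1]]
        if tag == "basis":                   # basis b = e --> [b |-> B']
            return Basis(bases={d[1]: self.exp(B, d[2])})
        if tag == "local":                   # local d1 in d2 end
            B1 = self.dec(B, d[1])
            return self.dec(B.plus(B1), d[2])
        if tag == "open":                    # open b1 ... bn
            out = Basis()
            for b in d[1]:
                out = out.plus(B.bases[b])
            return out
        if tag == "seq":                     # d1 d2, folded over a list; [] is <empty>
            out = Basis()
            for item in d[1]:
                out = out.plus(self.dec(B.plus(out), item))
            return out
        raise ValueError(tag)

    def duplicates(self):
        # sml files elaborated more than once: candidates for a warning.
        return sorted(f for f, n in self.sml_runs.items() if n > 1)

el = Elaborator(
    sml={"A.sml": ["a"], "B.sml": ["t"]},
    mlb={
        "lib.mlb": ("seq", [("sml", "A.sml")]),
        "main.mlb": ("seq", [("mlb", "lib.mlb"), ("mlb", "lib.mlb"),
                             ("sml", "B.sml"), ("sml", "B.sml")]),
    },
)
result = el.dec(Basis(), ("mlb", "main.mlb"))
```

In this run lib.mlb is referenced twice but A.sml is elaborated only
once because of the cache, whereas B.sml is elaborated twice and is
the only file that el.duplicates() reports.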

>> To obtain a simple mechanism for generating unique machine code
>> labels (and new type names, for that matter), it would be great if
>> the concatenation of an mlb-file name and a source file name
>> uniquely determines the source file.
>
> I think with the above semantics it will uniquely determine an
> instance of the source file, which I would think is enough.

I don't think that is enough. Consider the mlb-file:

 bad.mlb:
  A.sml B.sml C.sml D.sml B.sml E.sml

and the content map

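 A.sml:  type t0 = int
 B.sml:  datatype t = T of t0
 C.sml:  val a = T 4
 D.sml:  type t0 = bool
 E.sml:  val _ = case a of T b => print (Bool.toString b)

Here the ``type name'' "B.sml-t" is not sufficient to ensure type
soundness: the two instances of B.sml declare different datatypes (T
carries an int in the first and a bool in the second), yet both would
get the same name, so E.sml would be accepted and apply Bool.toString
to 4. But I think I can use the position in the mlb-file to get things
to work.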
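A small Python sketch of the naming problem, under stated assumptions:
the function generated_type_names and the "mlb:position:file-tycon"
format are invented here for illustration and are not the ML Kit's
actual scheme. Only datatype declarations generate new type names, so
only B.sml appears in the datatypes map.

```python
from typing import Dict, List

def generated_type_names(mlb: str, files: List[str],
                         datatypes: Dict[str, List[str]],
                         use_position: bool) -> List[str]:
    """Name each datatype generated while elaborating `files` in order.

    Without the position, the name is built from the mlb-file name and
    the source file name only; with it, the index of the reference
    within the mlb-file is included as well.
    """
    names = []
    for pos, f in enumerate(files):
        for tycon in datatypes.get(f, []):
            stem = f"{mlb}:{pos}:{f}" if use_position else f"{mlb}:{f}"
            names.append(f"{stem}-{tycon}")
    return names

files = ["A.sml", "B.sml", "C.sml", "D.sml", "B.sml", "E.sml"]
datatypes = {"B.sml": ["t"]}   # B.sml contains the only datatype declaration

by_file = generated_type_names("bad.mlb", files, datatypes, use_position=False)
by_position = generated_type_names("bad.mlb", files, datatypes, use_position=True)
```

Here by_file contains the same name twice, so the two instances of t
are conflated, while by_position keeps them apart. This relies on each
mlb-file being elaborated only once, as in the caching rules quoted
earlier; for nested <bdec>s the position would presumably have to be a
path into the syntax tree rather than a flat index.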

Cheers,

Martin