[MLton] mlb support

Mon, 28 Jun 2004 14:06:14 -0700

> > <ann> ::= showBasis <file>
> >         | showBasisUsed <file>
> >         | showDefUse <file>
> >         | warnUnused
> 
> I would kick showBasis out of this group.  It is not def-use related; in
> particular, the basis defined by a particular bdec remains the same,
> regardless of how the bdec is used.

I agree that showBasis is not def-use related and the basis is
constant, but I still think it makes perfect sense to write

	ann showBasis foo in bdec end

to cause the basis that bdec defines to be written to foo.  And having
an annotation provides finer-grained control than a command-line
switch.

> Also, I'm not keen on the showX <file> annotations; seems weird to
> produce these files just based on the annotations.

I agree it's strange to cause side effects with the annotations. But,
it's not a showstopper to me.  Also, we're already causing elaboration
side effects with the other annotations (allowConst, rebindEquals).

> z.mlb:
> 
>   local
>     basis B = bas  basis-2002.mlb  end
>     basis B1 = bas  ann showBasisUsed a1.basis-used in open B end  end
>     basis B2 = bas  ann showBasisUsed a2.basis-used in open B end  end
>   in
>     local open B1 in a1.sml end
>     local open B2 in a2.sml end
>   end
> 
> I'd like to believe that this will write to a1.basis-used the portion of
> the Basis Library used in a1.sml and write to a2.basis-used the portion of
> the Basis Library used in a2.sml, 

That does seem logical.

> but don't think that will happen with the infrastructure we have. 

Yep.

> I suspect that this will actually write nothing to either of the
> files, because no new decs are introduced by the annotations,
> despite the fact that new bindings are introduced.

For "ann showBasisUsed foo in bdec end", I think it would be pretty
easy to save the basis produced by bdec, then to wait until the
elaboration finishes, walk through the saved basis and write out the
used information.  We could even have showBasisUsed make a copy (in
the sense of def-use information) of the basis that bdec produces so
that we get what you want above.  But I agree that could get messy and
isn't clearly necessary.  So let's keep shooting at something simpler.

Let's see if we can work from the other direction -- starting with
what our infrastructure can easily provide and developing some
annotations with reasonable semantics.

The data our infrastructure stores:

1. With every binding (of a variable, tycon, strid, ...), we keep two
   pieces of information (Uses.t):
   a. a list of all the occurrences where it is used
   b. a bool indicating whether or not the binding may be reported as
      unused via -warn-unused.

2. With each structure component, we keep the same information as with
   the binding that got stored in the structure.

3. For each namespace, we keep a list of all the identifiers that
   have been defined (at any scope).

The operations that our infrastructure provides:

1. Env.clearDefUses, which
   a. for each namespace, sets the list of defined identifiers to [].
   b. for each defined symbol and each binding that occurs in a
      structure, clears the use information.

2. Env.forceUsed, which, for each defined symbol and each binding that
   occurs in a structure, sets the bool associated with the binding so
   that it cannot be reported as unused.

3. Env.{layout,layoutCurrentScope,layoutUsed}, which are various ways
   of displaying environments.
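
A rough SML model of the data and operations listed above may help when
reading the expansions that follow.  It is only a sketch: apart from
clearDefUses and forceUsed, every name here (Uses.new, reportable,
defineLocal, visible, unused, ...) is invented for illustration, the
per-namespace lists are collapsed into one list, and structure
components are not modeled at all.  In particular, treating "each
defined symbol" as "bindings still visible at top level" is an
assumption.

```sml
(* Hypothetical model; only clearDefUses and forceUsed are names taken
   from the discussion, everything else is made up for illustration. *)
structure Uses =
struct
   (* uses: occurrences where the binding is used.
      mayWarn: whether the binding may still be reported as unused. *)
   type t = {uses: string list ref, mayWarn: bool ref}
   fun new () : t = {uses = ref [], mayWarn = ref true}
   fun clear ({uses, ...}: t) = uses := []
   fun force ({mayWarn, ...}: t) = mayWarn := false
   fun reportable ({uses, mayWarn}: t) = !mayWarn andalso null (!uses)
end

structure Env =
struct
   (* defs: every binding defined at any scope (one list stands in for
      the per-namespace lists).
      visible: bindings still in scope at top level (an assumption). *)
   val defs: (string * Uses.t) list ref = ref []
   val visible: (string * Uses.t) list ref = ref []

   (* A binding whose scope has already ended: recorded, not visible. *)
   fun defineLocal (name: string) : Uses.t =
      let val u = Uses.new ()
      in defs := (name, u) :: !defs; u
      end

   (* A top-level binding: recorded and visible. *)
   fun define (name: string) : Uses.t =
      let val u = defineLocal name
      in visible := (name, u) :: !visible; u
      end

   fun clearDefUses () =
      (defs := []; List.app (fn (_, u) => Uses.clear u) (!visible))

   fun forceUsed () = List.app (fn (_, u) => Uses.force u) (!visible)

   (* What a -warn-unused style report would list in this model. *)
   fun unused () : string list =
      List.map (fn (name, _) => name)
         (List.filter (fn (_, u) => Uses.reportable u) (!defs))
end

(* Mimics: basis, clearDefUses, user program, forceUsed. *)
val _ = Env.define "List.map"     (* basis binding *)
val () = Env.clearDefUses ()
val _ = Env.defineLocal "tmp"     (* unused local in the user program *)
val _ = Env.define "main"         (* top-level binding in the user program *)
val () = Env.forceUsed ()
val report = Env.unused ()        (* ["tmp"] *)
```

In this model the basis binding drops out because the definition list
was emptied before the user program was elaborated, and "main" drops out
because forceUsed marked it as not reportable.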

One natural thing to do is to reify the operations that we have and
make them available in the mlb files.  Something like

bdec ::= clearDefUses
       | forceUsed
       | layout <file>
       | layoutCurrentScope <file>
       | layoutUsed <file>

Then, the expansion of f.sml on the command line becomes:

f.sml   ==>     local
                   $(SML_BASIS)/basis-2002.mlb
                   clearDefUses
                in
                   f.sml
                end
                forceUsed

This will cause -warn-unused to only display def-use information for
the user program.

On the other hand, the expansion of no file on the command line
becomes:

<no file>  ==>  $(SML_BASIS)/basis-2002.mlb

Hence, -warn-unused will display def-use information for the basis
library.

One problem with this approach is that Env.clearDefUses doesn't scope
well.  That is, it clears the namespace definition lists completely,
not just what's in the local scope.  Maybe it would be cleaner to have
something like what I suggested before:

	ann ::= keepDefs {false|true}

With keepDefs false, the elaborator simply wouldn't add bindings to
the namespace lists.  With this, we could expand f.sml to

f.sml   ==>     local
                   ann keepDefs false in $(SML_BASIS)/basis-2002.mlb end
                in
                   f.sml
                end
                forceUsed

I guess forceUsed could be an annotation too.

f.sml	==>	local
		   ann keepDefs false in $(SML_BASIS)/basis-2002.mlb end
		in
		   ann forceUsed in f.sml end
		end

>     + suppose I have a pile of code from somewhere, that I want to use
>       in a library.  I'm a principled programmer, so I always compile with
>       -warn-match true & -sequence-unit true.  Unfortunately, the author
>       of the other code wasn't so principled, and I'd love to be able to
>       include their code in a mlb as follows:
> 
>       proj.mlb
>          ann warnMatch false, sequenceUnit false in util.mlb end
>          ann warnMatch true, sequenceUnit true in mycode.sml end

This is easy.  For sequenceUnit, simply keep track of the current
value of the bool in the elaborator and use it instead of
!Control.sequenceUnit when deciding whether to warn in
elaborate-core.fun.  For warnMatch, in the CoreML that comes out of
the elaborator, record the current value of the bool in every Case
expression.  Then, the code that generates the warnings in
defunctorize.fun can use the bool in the Case expression instead of
the global Control value to decide whether or not to warn.
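
As a sketch of the warnMatch half of this, assuming a toy expression
type rather than the real CoreML: the flag in force when a Case is
built gets stored in the node, and a later pass reads the stored value.
The names fluidLet, warnMatchNow, mkCase and shouldWarn are
hypothetical and are not taken from the MLton sources.

```sml
(* Hypothetical sketch; not MLton's actual CoreML or Control code. *)

(* Run f with r temporarily set to v, restoring r afterwards. *)
fun fluidLet (r: 'a ref, v: 'a, f: unit -> 'b) : 'b =
   let
      val old = !r
      val () = r := v
      val result = f () handle e => (r := old; raise e)
   in
      r := old; result
   end

(* Stands in for the elaborator's current warnMatch value. *)
val warnMatchNow: bool ref = ref true

datatype exp =
   Var of string
 | Case of {test: exp, rules: (string * exp) list, warn: bool}

(* "Elaboration": capture the flag at the point the Case is built. *)
fun mkCase (test: exp, rules: (string * exp) list) : exp =
   Case {test = test, rules = rules, warn = !warnMatchNow}

(* Later pass: consult the recorded flag, not a global. *)
fun shouldWarn (Case {warn, ...}) = warn
  | shouldWarn (Var _) = false

(* ann warnMatch false in ... end, followed by unannotated code *)
val quiet = fluidLet (warnMatchNow, false, fn () => mkCase (Var "x", []))
val loud = mkCase (Var "y", [])
```

Here shouldWarn quiet is false and shouldWarn loud is true, even though
both are inspected after the annotation's scope has ended.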

>       Note, I wouldn't expect the warnMatch true and sequenceUnit true
>       annotations to generate warnings if the appropriate options weren't
>       set on the command line.  So, one probably wouldn't use such
>       annotations unless they were buried under a corresponding false
>       annotation.

Dunno.  It might be more natural to have the outermost annotation be
set by the command-line switch, and then have the completely normal
fluidLet behaviour underneath.  So, setting warnMatch true will
*always* cause the warning to be displayed, regardless of the
command-line switch.
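
The two readings can be compared with a small model, assuming the
enclosing annotations are given as a list with the innermost first.
Both function names are made up for this sketch.

```sml
(* Hypothetical model of the two candidate semantics. *)

(* The command-line switch is the outermost binding; the innermost
   annotation, if any, wins. *)
fun fluidPolicy (cmdLine: bool, anns: bool list) : bool =
   case anns of
      [] => cmdLine
    | innermost :: _ => innermost

(* Annotations can only silence warnings that the command line
   enabled; they never turn warnings on by themselves. *)
fun gatedPolicy (cmdLine: bool, anns: bool list) : bool =
   cmdLine andalso fluidPolicy (cmdLine, anns)

(* No -warn-match on the command line, code under "ann warnMatch true" *)
val fluid = fluidPolicy (false, [true])   (* true: warning shown *)
val gated = gatedPolicy (false, [true])   (* false: warning suppressed *)
```

The two policies agree whenever the command-line switch is on; they
differ only for a true annotation under a switch that is off.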

>     + I might also develop my code with -warn-unused true, but doing so
>       leads to lots of errors from the old code.  Could we accommodate
>       turning off the warning in the old code?

Sure, I think the keepDefs annotation does this nicely.
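
A minimal sketch of how that could look, assuming keepDefs is a bool
ref consulted whenever the elaborator would record a definition.  The
names fluidLet, define and defs are hypothetical, and a single string
list stands in for the per-namespace lists.

```sml
(* Hypothetical sketch; not the actual elaborator code. *)

(* Run f with r temporarily set to v, restoring r afterwards. *)
fun fluidLet (r: 'a ref, v: 'a, f: unit -> 'b) : 'b =
   let
      val old = !r
      val () = r := v
      val result = f () handle e => (r := old; raise e)
   in
      r := old; result
   end

val keepDefs: bool ref = ref true
val defs: string list ref = ref []

(* Record a definition only while keepDefs is true. *)
fun define (name: string) : unit =
   if !keepDefs then defs := name :: !defs else ()

(* ann keepDefs false in util.mlb end *)
val () =
   fluidLet (keepDefs, false,
             fn () => List.app define ["oldHelper", "oldUnused"])

(* mycode.sml, elaborated with the default keepDefs true *)
val () = List.app define ["main", "tmp"]

val candidates = rev (!defs)   (* ["main", "tmp"] *)
```

Only the bindings from the new code end up as candidates for unused
warnings; the old code's bindings were never recorded.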

Here's our status as I see it.  

-show-basis
	We could allow it only on the command line, or we could have
	it as a side-effecting annotation.  Neither of those is
	problematic from an implementation perspective.

-show-basis-used
	To support this, we appear to need an annotation like
	layoutUsed.

-show-def-use
-warn-unused
	We have a reasonable solution via the keepDefs annotation,
	which has a simple implementation as a fluidLet bool ref.
	This allows us to recover our old behavior, and to do new
	stuff like selectively disable unused warnings for old code.

I think it would be fine for a first cut to have

ann ::= allowConst {false|true}
      | rebindEquals {false|true}
      | keepDefs {false|true}
      | sequenceUnit {false|true}
      | warnMatch {false|true}

Then, we can implement -show-basis, -show-def-use, and -warn-unused as
command line options.  For now, we can drop -show-basis-used.  Once we
get a better feel for things, we can add in layoutX annotations for
that and possibly for showBasis.