[MLton] free type variables in datatype decs
Matthew Fluet
fluet@cs.cornell.edu
Wed, 2 Feb 2005 14:35:36 -0500 (EST)
I was going to 'tit-for-tat' Andreas with a HaMLet bug report, noting that
HaMLet does not accept the following program:
val id = fn x =>
let
datatype z = Z of 'a
val unZ = fn Z x => x
in
unZ (Z x)
end;
MLton accepts the program, though I am not entirely convinced that it is
in keeping with the Definition. Section 4.6 (Scope of Explicit Type
Variables) states
"we still have to account for the scope of an explicit type variable
occurring in the ': ty' of a typed expression or pattern or in the
'of ty' of an exception binding. For the rest of this section, we
consider such free occurrences of type variables only."
That would appear to (implicitly) rule out scoping an explicit type
variable from a datatype binding.
Interestingly, the Definition explicitly rules out free type variables in
datatype descriptions in the syntax of Modules, though there is no
analogous restriction in the syntax of the Core.
Andreas' "Mistakes and Ambiguities" posits that the analogous restriction
in the syntax of the Core is necessary; "without it the type system would
be unsound." I am convincing myself that this is the case.
I believe that MLton is currently incorrectly implementing Rule 28.
At issue is the closure operation applied to the value environment
elaborated from the conbinding.
Suppose for the moment that one expanded the criteria in "Scope of
Explicit Type Variables"; then we would attempt to elaborate
val 'a id = fn x =>
let
datatype z = Z of 'a
val unZ = fn Z x => x
in
unZ (Z x)
end;
Applying Rule 29, we obtain the value environment
VE = {Z -> ('a -> z, c)}
We should apply the closure operation to this to obtain
Clos VE = {Z -> (#['a].'a -> z, c)}
However, MLton is applying a slightly modified closure operation (one that
only generalizes over the type variables of the datatype, which in this
case is empty), yielding
{Z -> ('a -> z, c)}
If one continues with the Definition's resulting value environment, then
unZ is elaborated with type #['b].z -> 'b and the entire id is
elaborated with the type #['a,'b].'a -> 'b. It is clear how one can
construct unsound programs in light of such a "cast" function.