[MLton-user] a new approach to type-indexed values in SML

Stephen Weeks sweeks at sweeks.com
Fri Sep 29 17:39:55 PDT 2006


> I haven't yet had the time to read your post (or look at the code), but
> are you aware of the following (Haskell) papers:
> 
>   Generics for the Masses
>   Ralf Hinze
>   http://www.informatik.uni-bonn.de/~ralf/publications/ICFP04.pdf
> 
>   TypeCase: A Design Pattern for Type-Indexed Functions
>   Oliveira and Gibbons
>   http://citeseer.ist.psu.edu/oliveira05typecase.html

Yes.

> The second technique explained in Hinze's paper is a close relative
> of the value dependent encoding technique in SML.

Yes, and the first technique is a close relative of the
value-independent approach.

> The way I understand them, type classes do not (really) increase the
> expressive power of the type system.  Haskell techniques that
> (initially) seem to do things with type classes that appear
> impossible in SML generally (upon closer inspection) seem to also
> use either polymorphic recursion or higher-rank polymorphism.
...
> What type classes do provide is a kind of inference mechanism that
> allows many (but not all) things to be constructed implicitly as
> guided by the types.

I agree with your assessment.  Type classes are a convenient way to
implicitly apply higher-rank polymorphic functions.

> Actually, I'm not even entirely sure whether type classes really fit
> nicely into the imperative world of ML.  It seems intuitively
> important to me that the construction of type class dictionaries is
> free of side-effects, because it happens implicitly.

I agree that it is easier to reason about things if dictionary
construction is side-effect free.  We are not alone.  Here's what
Dreyer et al. have to say on the subject in "Modular Type Classes":

  http://mlton.org/References#Dreyer06

  it is important that the application of an instance functor does not
  engender any computational effects, such as I/O or
  non-termination.  We therefore require that instance functors be
  total in the sense that their bodies satisfy something akin to ML's
  value restriction.  This restriction appears necessary in order to
  ensure predictable program behavior.

The intended (and actual) use of tivs (type-indexed values) and
inductive rules in my note follows this, albeit without any guarantee
from the type system.  Separating side effects is one reason why the
tiv "apply" function is specified in the staged way it is:

  val apply: 'a Type.t -> 'a Value.t

So, for example, for a use of the equality tiv

  Equals.apply (ListTycon.ty IntTycon.ty) ([1, 2], [3, 4])

one first runs a (side-effect-free) computation to build the equality
function and then a (possibly side-effecting) computation to compute
equality.
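Concretely, the staging lets one bind the result of the first stage
once and reuse it.  A sketch, using the names above:

```sml
(* First stage: a side-effect-free computation that builds the
   equality function for int lists. *)
val equalsIntList : int list * int list -> bool =
   Equals.apply (ListTycon.ty IntTycon.ty)

(* Second stage: the (possibly side-effecting) computation of
   equality itself, reusable without rebuilding the function. *)
val b = equalsIntList ([1, 2], [3, 4])  (* false *)
```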

While I like using side-effect-free computations when I can, I do get
nervous about restricting side effects via the type system.  One could
shoot oneself in the foot and rule out something useful.  I haven't
yet seen a use for the side-effecting computation of a tiv, but I
could imagine, say, side-effecting memoization being useful.

> However, when one wants to build a type-index for an imperative type
> (ref or array) one sometimes really needs to perform a side-effect
> and one needs to be able to control when side-effects occur.
> Consider the show function for instance.  Cycle detection relies on
> performing a side-effect.

I don't get your point here.  The cycle detection in show does not
happen at tiv construction time; it happens once the value is
supplied.  In my example, Show.apply has the following type:

  val apply: 'a Type.t -> 'a -> string

As with Equals.apply, the first stage application to a type is a
side-effect-free computation.  Only the second stage uses side
effects.  Here is the fragment of my code defining the case for show
for the ref type constructor.

structure Z =
   DefCase1
   (structure Tycon = RefTycon
    structure Value = Show
    fun rule (t, RefRep.T {get, equals, ...}, iso) =
       let
          val elt = Show.apply t
       in
          fn (b, seen) =>
          checkSeen (Iso.inject (iso, b), seen,
                     fn (u1, u2) => equals (Iso.project (iso, u1),
                                            Iso.project (iso, u2)),
                     fn () => elt (get b, seen))
       end)

The first stage of the rule merely makes the inductive use of show.
The second stage after the "fn (b, seen) =>" may have side effects via
the checkSeen function.
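For reference, here is one plausible reading of checkSeen.  This is a
hedged sketch, not my actual code; in particular, representing "seen"
as a mutable list is an assumption:

```sml
(* Hypothetical sketch of checkSeen.  u is the value injected into
   the universal type, seen holds the values currently being shown,
   eq tests whether two universal values are the same ref, and show
   produces the string for an unseen value.  Mutating seen is the
   side effect confined to the second stage. *)
fun checkSeen (u : Univ.t,
               seen : Univ.t list ref,
               eq : Univ.t * Univ.t -> bool,
               show : unit -> string) : string =
   if List.exists (fn u' => eq (u, u')) (!seen)
      then "<cycle>"
   else (seen := u :: !seen; show ())
```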

> > Raw tivs themselves are deceptively simple.  They keep a list ref with
> > one element (a function) for each type constructor that has a rule.
> 
> Doesn't this mean that the compiler will have a very hard time dropping
> unused tis (type-indices) and tivs from the program?  If you define a ti
> for some type, and you have defined tivs for equality, ordering, hashing,
> serialization, random generation, etc..., wouldn't something be (almost
> certainly) registered (imperatively added to the list ref) for each of
> those tivs whether or not you actually use all of those tivs at the
> specific ti?

The definition of a ti does not add anything to any list ref.  Nor
does the definition of a tiv.  Only the definition of an inductive
rule (via DefCase) adds to a list ref, namely it adds a case for the
ti to the list ref of the tiv.

This is just the usual two-dimensional-table-lookup problem.  In this
case, for the global collection of rules we want to represent a
partial function

  lookup: TIV x TI -> RULE

so that we can lookup a rule when presented with a particular tiv and
ti.  The lookup function can be visualized as a two-dimensional table
in which the rows are tivs and the columns are tis.  One can represent
the table in one of three ways:

  1. A single data structure capable of looking up a (tiv, ti) pair.
  2. A data structure for each row (tiv), holding information for all
     of the columns (tis) for which lookup (tiv, ti) is defined. 
  3. A data structure for each column (ti), holding information for
     all of the rows (tivs) for which lookup (tiv, ti) is defined.

Each of these presents a potential space leak, and can hold on to
table entries that are unusable.

In the particular case at hand, I thought that (2) was the best
choice, since it allows one to collect unused tivs and the rules that
define them.  The only space leak is that one can hold on to inductive
rules for tivs that are in use but for tis that are not.  I don't feel
that is a problem in practice, especially given the static nature of
what's going on -- type indexes correspond to actual types in the
program.
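To make representation (2) concrete, here is a hedged sketch; the
structure and the names Tycon.t, Rule.t, defCase, and Tycon.equals
are illustrative, not the actual code:

```sml
structure Tiv =
   struct
      (* Each tiv owns its row of the table: a list ref mapping
         tis (type constructors) to inductive rules. *)
      datatype t = T of {rules: (Tycon.t * Rule.t) list ref}

      fun new () = T {rules = ref []}

      (* Defining an inductive rule adds an entry to the tiv's
         own row. *)
      fun defCase (T {rules}, tycon, rule) =
         rules := (tycon, rule) :: !rules

      (* Lookup searches only this tiv's row.  Tycon.equals is an
         assumed equality test on type constructors. *)
      fun lookup (T {rules}, tycon) =
         Option.map #2
         (List.find (fn (tycon', _) => Tycon.equals (tycon, tycon'))
                    (!rules))
   end
```

When a tiv becomes unreachable, its row (and the rules stored in it)
becomes unreachable with it, which is what makes (2) the choice that
allows collecting unused tivs.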

There are techniques that help with the two-dimensional-table-lookup
problem (note that weak pointers do not by themselves solve the
problem).  One is to expose an API to remove table entries.  An
"UndefCase" functor would be quite amusing :-).  Another is to use
finalizers to clean up the rows corresponding to a column when that
column gets collected.

> [...and later...]
> > This note uses only constant-time wrappers and unwrappers, and handles
> > mutable values smoothly.
> [...]
> 
> I'd like to better understand what (if any) overhead the approach implies.
> Could you elaborate on it?

There is overhead due to the two kinds of wrapping/coercion going on:
of tivs and of values.

For tivs, each tiv defines a way to lift an isomorphism to coerce back
and forth between its externally visible type and its underlying
representation.  For example, the equality tiv has type

  'a Type.t -> 'a * 'a -> bool

but underneath is represented by a function of type

  Univ.t * Univ.t -> bool

When equality is applied to some type "t Type.t", it converts this
function by lifting the isomorphism between type t and type Univ.t.
The effect is to create a function that takes a pair of arguments,
applies the injection from type t to the universal type, and calls the
underlying equality function.

In short, application of a tiv coerces by descending the type
structure of the tiv and applying a constant-time wrap or unwrap to
each tyvar (depending on whether the position of the tyvar is
covariant or contravariant).
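For the equality tiv, the lifting amounts to something like the
following sketch (liftEquals is an illustrative name; "inject" stands
for the injection into the universal type obtained from the iso):

```sml
(* Sketch of lifting for the equality tiv.  Given the injection
   from type t into the universal type and the underlying equality
   on Univ.t, build equality at type t.  Both argument positions
   are contravariant, so only injections are applied, each in
   constant time. *)
fun liftEquals (inject : 'a -> Univ.t)
               (equalsU : Univ.t * Univ.t -> bool)
               : 'a * 'a -> bool =
   fn (x, y) => equalsU (inject x, inject y)
```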

The second kind of wrapping/coercion is that used to provide a
universally-typed interface to polymorphic type constructors.  For
example, lists of all types are represented by a value of type Univ.t,
and each particular list type carries along an isomorphism of type

  (Univ.t, (Univ.t, Univ.t) ListTyconRep.t) Iso.t

that allows one to program generically with lists.

For example, a polymorphic length function on lists would look like

  fun length (i, l) =
     recur ((l, 0), fn ((l, n), loop) =>
            case Iso.project (i, l) of
               Nil => n
             | Cons (_, l) => loop (l, n + 1))

At every iteration of the loop, the projection is applied to the list.
If, as is likely, there are many types of lists, the projection will
involve a case dispatch on the type of the list, followed by a
construction of Nil or Cons, as appropriate, and probably involving
allocation.  The code that actually does the projection is
ListTycon.isorec:

    fun isorec () = 
       Iso.make (fn Cons (x, l) => x :: l | Nil => [],
                 fn [] => Nil | x :: l => Cons (x, l))

So, there is a constant-factor overhead to generic functions when
compared to their non-generic counterparts.  This overhead is due to
the fact that, underneath, a generic function is not specialized -- it
only deals with universal types.  Because MLton specializes type
representations, there is an unavoidable cost to mediate between the
specialized representations and the generic code, unless we could
convince MLton to duplicate generic functions in just the right way,
just as it does for polymorphic functions.

There is also, of course, the (currently linear) lookup of inductive
rules, but since that happens at tiv construction time, I didn't bother
optimize it.  One could use a hash table if there are going to be
large numbers of type constructors.

> > In the value-independent style, one uses a single family of types and
> > allows as many tivs as one wants to be defined for the family.  This
> > is a superior approach to the value-dependent approach because it
> > easier to add new tivs and because tivs are automatically composable
> > -- one isn't forced to collect together the tivs into a product.
> 
> It is better when compared in that particular aspect.  However, as you
> later note
> 
> > Yang's approach also uses deep coercion functions between ordinary SML
> > values and values of his universal type, which take time proportional
> > to the size of the data structure being coerced. This is inefficient
> > and doesn't work with mutable values. [...]
> 
> all is not rosy.  This is one of the reasons why I think that Yang's
> technique, while interesting from a theoretical pov, is not really
> practical (or at least not suitable for all tivs).

I agree that Yang's implementation of the value-independent approach
is not practical, due to problems with both modularity and
performance.  The main point of my note was to show that the
value-independent approach could be made practical, and that one could
have a system with its composability, while avoiding the modularity
and performance problems of Yang's technique.


