[MLton] Extending the SML Basis library (Library project)

Vesa Karvonen vesa.karvonen at cs.helsinki.fi
Tue Oct 10 00:14:04 PDT 2006


Perhaps the first thing we could do is start a library that provides
some (minor) extensions to the basis library.  As a reminder, here is how I
described it earlier:

  "An extended Basis library with some minor extensions to the library
  signatures and structures.  These are done in a non-intrusive manner by
  simply rebinding the signatures and structures.  The reason for
  extending the Basis library is that the extensions naturally belong to
  specific basis library modules.  Extensions include things like
  isomorphisms and embeddings (pairs of the form (toX, fromX)) and bounds
  (pairs of the form (minX, maxX))."

And here is a bullet from Stephen's reply:

> * I don't think we should be hamstrung by basis library compatibility.
>   I am fine with producing our own replacements for basis library
>   modules that are not compatible with the basis.  With MLBs, it is
>   easy for people to mix and match to get what they need.  In the long
>   run, I think it would be best if people could simply use
>
>     $(SML_LIB)/mltonlib/mltonlib.mlb
>
>   and get enough libraries to write useful code.

Well, I think that the basis library is a valuable library and it doesn't
make sense to throw it away.  In particular, there is a book describing
the basis library and people just learning SML are likely to spend time
learning the basis library.  IMO, it makes sense to build on that
knowledge.

However, I agree that maintaining 100% basis library compatibility is
unlikely to lead to an "optimal" design.  In particular, here is what the
basis library book says (page 11, start of section 2, emphasis added):

  "We view the signature and structure names used below as being
  *reserved*.  For an implementation to be conforming, any module it
  provides that is named in the SML Basis Library must *exactly* match the
  description specified in the Library."

So, the design of the basis library is supposed to be more or less cast in
stone (at least if you want to claim that you've implemented the SML Basis
Library).  However, the way I see it, the basis library contains an
organizational framework that goes beyond the exact signatures and
structures specified.  I think that for many simple extensions there is a
place in that organizational framework, and while it isn't technically
necessary to extend the basis library, it makes sense to do so because it
can reduce the learning curve and make the entirety easier to use.

On the other hand, I don't think that everything should be put into such
an extended basis library.  As a rule of thumb, I'd say that things that
naturally belong (fuzzy, yes) to specific basis library modules and what
those things depend on should go into such an extended basis lib.
Everything else, even if it looks like stuff that could be in a basis lib,
should go into other libraries when there is no module in *the* basis lib
for it.

At the end of this message is code extracted from our proposed extended
basis lib defining the extended signatures (except for the TEXT signature
which is just a redefinition using the extended substructure signatures).
Note that the signatures aren't meant to be cast in stone.  I think there
are many more extensions (and minor changes) that would probably be best
expressed as "deltas" to specific basis library modules.
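
As a rough illustration of what such a "delta" could look like, here is a
minimal sketch that extends one module by rebinding its signature and
structure.  Only the names is0, isEven, isOdd, and bounds come from the
signatures at the end of this message; the function bodies are assumptions
made for the example, not the library's actual code.

```sml
(* Sketch only: the function bodies are assumptions, not library code. *)

(* Rebind the signature.  On the right-hand side, INTEGER still refers to
 * the original Basis signature. *)
signature INTEGER =
   sig
      include INTEGER
      val is0 : int -> bool
      val isEven : int -> bool
      val isOdd : int -> bool
      val bounds : (int * int) option
   end

(* Rebind the structure the same way.  Inside the body, Int still refers
 * to the original Basis structure. *)
structure Int : INTEGER =
   struct
      open Int
      fun is0 i = i = 0
      fun isEven i = i mod 2 = 0
      fun isOdd i = not (isEven i)
      (* NONE when the integer type has no bounds, as with IntInf. *)
      val bounds =
         case (minInt, maxInt) of
            (SOME lo, SOME hi) => SOME (lo, hi)
          | _ => NONE
   end
```

Code that only uses the standard members of Int keeps working unchanged,
because the ascription is transparent and everything from the original
structure is still there.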

Here is some brief rationale for the particular extensions (in addition to
the fact that we're using these in our other libs):

- The isomorphism, embedding, and bound pairs are added because the
modules already contain their components, but not the pairs.  Having the
pairs is convenient for a number of things (such as for building
conversions back and forth at the same time, and in conjunction with
things like type-indexed functions).

- The types of isomorphisms and embeddings are not made abstract, because
the value restriction would then prevent making polymorphic isomorphisms
and embeddings.  Alternatively, to distinguish them from arbitrary pairs,
they could be specified as datatypes of the form

  datatype ('a, 'b) iso = ISO of ('a -> 'b) * ('b -> 'a)
  datatype ('a, 'b) emb = EMB of ('a -> 'b) * ('b -> 'a option)

and with minimal associated modules.

- The functions is0, isEven, isOdd, toList, etc. are just little
functions that are often handy, and it makes sense not to reinvent them
repeatedly.

- The STRING signature is extended to become a full MONO_VECTOR.  At least
I've found it annoying that one has to use both String and CharVector to
manipulate strings.  It might make sense to clean up the STRING signature.
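
The points above can be sketched roughly as follows.  The two type
abbreviations are the ones given at the end of this message; every other
name (charInt, intString, stringList, invert, roundTrip, swap, digits) is
hypothetical and exists only for this example.  The String rebinding at the
end is one possible way to get the vector operations, assuming, as the
Basis specifies, that CharVector.vector is the same type as string.

```sml
(* Sketch only.  Apart from iso and emb, all names here are hypothetical. *)
type ('a, 'b) iso = ('a -> 'b) * ('b -> 'a)
type ('a, 'b) emb = ('a -> 'b) * ('b -> 'a option)

(* Pairs built from components the Basis already provides. *)
val charInt : (char, int) iso = (Char.ord, Char.chr)
val intString : (int, string) emb = (Int.toString, Int.fromString)
val stringList : (string, char list) iso = (String.explode, String.implode)

(* Both directions travel together, so generic helpers take one argument. *)
fun invert ((a2b, b2a) : ('a, 'b) iso) : ('b, 'a) iso = (b2a, a2b)
fun roundTrip ((toB, fromB) : ('a, 'b) emb) x = fromB (toB x)

(* A pair of fn expressions is a syntactic value, so this binding is
 * generalized.  With an abstract iso type it would have to be built by a
 * function application, which the value restriction does not generalize. *)
val swap : ('a * 'b, 'b * 'a) iso =
   (fn (a, b) => (b, a), fn (b, a) => (a, b))

val swapped1 = #1 swap ("one", 1)     (* used at string * int *)
val swapped2 = #1 swap (true, #"c")   (* and again at bool * char *)

(* One way to let String offer the vector operations as well: open
 * CharVector first, then String, so String's own bindings win on clashes. *)
structure String =
   struct
      open CharVector
      open String
   end

(* Count the digits with a fold, without reaching for CharVector. *)
val digits =
   String.foldl (fn (c, n) => if Char.isDigit c then n + 1 else n) 0 "a1b22"
```

The datatype variant mentioned above would keep the polymorphism too, since
applying a constructor such as ISO to a pair of fn expressions is also a
syntactic value.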

Like I said, the signatures below are not meant to be cast in stone.  For
example, MLton's implementation of the basis library contains many extensions
that could be added/exposed as well.

So, if a library like this seems worth having, it could go into

  mltonlib/extended-basis/unstable .

I look forward to your critique, thoughts, and feedback.

--Vesa Karvonen


type ('a, 'b) iso = ('a -> 'b) * ('b -> 'a)
type ('a, 'b) emb = ('a -> 'b) * ('b -> 'a option)

signature GENERAL =
   sig
      include GENERAL
      type ('a, 'b) iso = ('a, 'b) iso
      type ('a, 'b) emb = ('a, 'b) emb
   end

signature ARRAY =
   sig
      include ARRAY
      val list : ('a array, 'a list) iso
      val toList : 'a array -> 'a list
   end

signature CHAR =
   sig
      include CHAR
      val int : (char, Int.int) iso
      val minOrd : Int.int
      val boundsChar : char * char
      val boundsOrd : Int.int * Int.int
   end

signature INTEGER =
   sig
      include INTEGER
      val int : (int, Int.int) iso
      val large : (int, LargeInt.int) iso
      val string : (int, string) emb
      val is0 : int -> bool
      val isEven : int -> bool
      val isOdd : int -> bool
      val bounds : (int * int) option
   end

signature INT_INF =
   sig
      include INT_INF
      val int : (int, Int.int) iso
      val large : (int, LargeInt.int) iso
      val string : (int, string) emb
      val is0 : int -> bool
      val isEven : int -> bool
      val isOdd : int -> bool
      val bounds : (int * int) option
   end

signature MONO_ARRAY =
   sig
      include MONO_ARRAY
      val list : (array, elem list) iso
      val toList : array -> elem list
   end

signature MONO_VECTOR =
   sig
      include MONO_VECTOR
      val list : (vector, elem list) iso
      val toList : vector -> elem list
   end

signature REAL =
   sig
      include REAL
      val decimal : (real, IEEEReal.decimal_approx) emb
      val int : (Int.int, real) iso
      val large : (real, LargeReal.real) iso
      val largeInt : (LargeInt.int, real) iso
      val manExp : (real, {man : real, exp : int}) iso
      val string : (real, string) emb
   end

signature STRING =
   sig
      include STRING
      val list : (string, char list) iso
      val cString : (string, string) emb
      val string : (string, string) emb

      type vector = string
      type elem = char

      val all : (elem -> bool) -> vector -> bool
      val app  : (elem -> unit) -> vector -> unit
      val appi : (int * elem -> unit) -> vector -> unit
      val exists : (elem -> bool) -> vector -> bool
      val find  : (elem -> bool) -> vector -> elem option
      val findi : (int * elem -> bool) -> vector -> (int * elem) option
      val foldl  : (elem * 'a -> 'a) -> 'a -> vector -> 'a
      val foldli : (int * elem * 'a -> 'a) -> 'a -> vector -> 'a
      val foldr  : (elem * 'a -> 'a) -> 'a -> vector -> 'a
      val foldri : (int * elem * 'a -> 'a) -> 'a -> vector -> 'a
      val fromList : elem list -> vector
      val length : vector -> int
      val mapi : (int * elem -> elem) -> vector -> vector
      val maxLen : int
      val tabulate : int * (int -> elem) -> vector
      val toList : vector -> elem list
      val update : vector * int * elem -> vector
   end

signature VECTOR =
   sig
      include VECTOR
      val list : ('a vector, 'a list) iso
      val toList : 'a vector -> 'a list
   end

signature WORD =
   sig
      include WORD
      val toWord : word -> Word.word
      val fromWord : Word.word -> word
      val int : (word, Int.int) iso
      val intX : (word, Int.int) iso
      val large : (word, LargeWord.word) iso
      val largeInt : (word, LargeInt.int) iso
      val largeIntX : (word, LargeInt.int) iso
      val largeX : (word, LargeWord.word) iso
      val word : (word, Word.word) iso
      val string : (word, string) emb
      val is0 : word -> bool
      val isEven : word -> bool
      val isOdd : word -> bool
      val minWord : word
      val maxWord : word
      val bounds : word * word
   end


