[MLton-user] Re: RFC: Extended Basis Library
Vesa Karvonen
vesa.karvonen at cs.helsinki.fi
Thu Mar 1 07:30:20 PST 2007
Quoting Geoffrey Alan Washburn <geoffw at cis.upenn.edu>:
> Vesa Karvonen wrote:
[...]
> > So, one could use a name like
> >
> > sortByCmp
> >
> > for the generally fastest comparison based sorting function and
> >
> > stableSortByCmp
> >
> > for the sorting function that is as fast as possible while being stable
> > (O(n log n) and preserves relative order of equal elements).
>
> Given these two specifications, which algorithms do you think you
> would select?
For sorting lists, I would use (an optimized) merge sort in stableSort.
For the non-stable sort there may be faster alternatives, but as a first
approximation I would probably also use merge sort. If there is a lot of
free memory and the list is long, then copying to an array and sorting it
with intro sort (variation of quick sort) might be significantly faster.
Note that the misc-util library in mltonlib already has a merge sort implementation:
http://mlton.org/cgi-bin/viewsvn.cgi/mltonlib/trunk/com/ssh/misc-util/unstable/sorted-list.sml?view=auto
http://mlton.org/cgi-bin/viewsvn.cgi/mltonlib/trunk/com/ssh/misc-util/unstable/sorted-list-test.sml?view=auto
> > The main reason I assumed arbitrary precision is that the Word types
> > already work like unsigned integers with modular arithmetic.
>
> Hmm. I hadn't really thought about using words, but that may be
> practical enough for my purposes. Though I imagine someone still might
> like a natural number abstraction.
I think that it would be *nice to have* natural numbers, but I can't say
that I would have really needed them that often. If you do need them,
then I think that it would make a fine addition to the Extended Basis.
> > http://mlton.org/References#WangMurphy .
> [...] Has anyone tried using something like the Fold library
> (http://mlton.org/Fold) to provide nth projections and injections with a
> single function?
That shouldn't be too hard. Looking at page 4 of the article, I think
that something like this should suffice (uncompiled):
fun prjF ? = Fold.fold (prj, id) ?
fun S ? = Fold.step0 prj_succ ?
Instead of
case prjn x of
...
you could then write
__n__
/ \
case prjF S ... S $ x of
...
> > The Monad core signature:
> >
> > signature MONAD_CORE = sig
> > type 'a monad
> > val return : 'a -> 'a monad
> > val >>= : 'a monad * ('a -> 'b monad) -> 'b monad
> > end
> >
> This is essentially the same as what I had come up with for what I
> had called BASIC_MONAD. Though I had chosen to go with the signature
>
> signature BASIC_MONAD = sig
> type 'a t
> val return : 'a -> 'a t
> val >>= : 'a t * ('a -> 'b t) -> 'b t
> end
>
> As using »t« as the type seems to be convention used most other places,
> but with »where type« constraints I don't know that it really matters
> that much.
I think that the t convention is not appropriate for abstract or concept
signatures that may be extended and implemented by many structures. As
explained in
http://mlton.org/pipermail/mlton/2006-December/029469.html
the point of not using t is to make it possible to combine signatures.
Signatures that have a type named t can not be included into a combined
signature:
signature WONT_WORK =
sig
include SIG_WITH_T
include ANOTHER_SIG_WITH_T
end
IMO, using type t in a monad signature would be a mistake.
> > signature MONADP_CORE = sig
> > include MONAD_CORE
> > val zero : 'a monad
> > val plus : 'a monad BinOp.t
> > end
> >
> > I'm not sure if the names zero and plus are the best.
>
> I called them »zero« and ++.
Hmmm... The symbolic ++ and an infix declaration for ++ might be better.
What about associativity and precedence? We should probably reconsider
the precedences of all monadic operators.
[...]
> It might be worth either adding some additional operations to
> MONADP_CORE, or perhaps make things even more elaborate. I have three
> other MONADP operations that cannot (usually) be defined in terms of
> MONADP_CORE.
>
> val fail : 'a monad
> val once : 'a monad -> 'a monad
> val ifte : 'a monad -> ('a -> 'b monad) -> 'b monad -> 'b monad
>
> [...] (http://www.cs.rutgers.edu/~ccshan/logicprog/LogicT-icfp2005.pdf) [...]
In section 4.1 of the article, the authors specify a new type class called
LogicM that inherits from MonadPlus rather than change the MonadPlus
class. I think that any extra primitives should go into another
(extended) signature. The basic MonadPlus class (or MONADP_CORE) is
useful for many things without the extra primitives. Also, from my quick
reading of the article, I get the impression that once and ifte (and
interleave and >>-) can be implemented in terms of msplit + MonadPlus, so
I'm not sure that having once and ifte in a CORE signature makes sense.
> Another variation I use is to actually give »zero« and »fail« arguments
>
> type error
> val zero : error -> 'a monad
> val fail : error -> 'a monad
>
> and then a »where type« constraint is used to specify the nature of
> »error« values. I'm not sure how to fit this in well.
AFAIU, the fail function in Haskell's Monad type class is there mainly for
practical reasons related to the Haskell language and the same reasons do
not apply to SML. In particular, in SML, you can just raise an exception
if you detect an *error* during a computation. OTOH, if you want to have
a monad with more control, e.g. over backtracking, than is offered by
MonadPlus (or MONADP_CORE), then you should use an extension of it,
e.g. LogicM (or, say, LOGICM_CORE). So, I would rather see signatures
LOGICM_CORE, LOGICM_EX and LOGICM rather than have MONADP_CORE changed.
> > In addition to the above, I also have versions of the above signatures
> > with an extra type parameter:
[...]
> It depends upon your application I expect, but while I used to use
> an extra parameter for my state monad signature, I've since switched to
> something that would look like
>
> signature STATE_MONAD = sig
> include MONAD_CORE
> include MONAD_EX where type 'a monad_ex = 'a monad
> type state
> val getState : state monad
> val setState : state -> unit monad
> val run : state -> 'a monad -> 'a monad
> end
Is the type of run correct?
Something like that could also be provided in addition to other monad
signatures. However, I would factor that into combinable signatures.
Something like this:
signature MONAD_WS = sig (* WS = WITH_STATE *)
type 'a monad_ws
type monad_ws_state
val getState : monad_ws_state monad_ws
val setState : monad_ws_state -> Unit.t monad_ws
val run : monad_ws_state -> 'a monad_ws UnOp.t
end
signature MONAD_STATE = sig
include MONAD
include MONAD_WS where type 'a monad_ws = 'a monad
end
Doing it like this means that you can then combine MONAD_WS with
MONADP_CORE
signature MONADP_STATE = sig
include MONADP
include MONAD_WS where type 'a monad_ws = 'a monad
end
and LOGICM
signature LOGICM_STATE = sig
include LOGICM
include MONAD_WS where type 'a monad_ws = 'a monad
end
> > Well, I think that there are too few SML libraries. I think that lazy
> > lists are something that people use often enough that it makes sense to
> > provide a reasonable implementation in the Extended Basis library.
> >
> Sounds reasonable. Given that it is probably worth considering some
> additional container signatures. Sets and maps at the very least.
I agree.
-Vesa Karvonen
More information about the MLton-user
mailing list