MLton 20241230

Programmers coming from languages that have a return statement, such as C, Java, and Python, often ask how one can translate functions that return early into SML. This page briefly describes a number of ways to translate uses of return to SML.

Conditional iterator function

A conditional iterator function, such as List.find, List.exists, or List.all is probably what you want in most cases. Unfortunately, it might be the case that the particular conditional iteration pattern that you want isn’t provided for your data structure. Usually the best alternative in such a case is to implement the desired iteration pattern as a higher-order function. For example, to implement a find function for arrays (which already exists as Array.find) one could write

fun find predicate array = let
   fun loop i =
       if i = Array.length array then
          NONE
       else if predicate (Array.sub (array, i)) then
          SOME (Array.sub (array, i))
       else
          loop (i+1)
in
   loop 0
end

Of course, this technique, while probably the most common case in practice, applies only if you are essentially iterating over some data structure.

Escape handler

Probably the most direct way to translate code using return statements is to basically implement return using exception handling. The mechanism can be packaged into a reusable module with the signature (exit.sig):

(**
 * Signature for exit (or escape) handlers.
 *
 * Note that the implementation necessarily uses exception handling.  This
 * is to make proper resource handling possible.  Exceptions raised by the
 * implementation can be caught by wildcard exception handlers.  Wildcard
 * exception handlers should generally reraise exceptions after performing
 * their effects.
 *)
signature EXIT = sig
   type 'a t
   (** The type of exits. *)

   val within : ('a t, 'a) CPS.t
   (**
    * Sets up an exit and passes it to the given function.  The function
    * may then return normally or by calling {to} with the exit and a
    * return value.  For example,
    *
    *> Exit.within
    *>    (fn l =>
    *>        if condition then
    *>           Exit.to l 1
    *>        else
    *>           2)
    *
    * evaluates either to {1} or to {2} depending on the {condition}.
    *
    * Note that the function receiving the exit is called from a non-tail
    * position.
    *)

   val to : 'a t -> 'a -> 'b
   (**
    * {to l v} returns from the {within} invocation that introduced the
    * exit {l} with the value {v}.  Evaluating {to l v} outside of the
    * {within} invocation that introduced {l} is a programming error and
    * raises an exception.
    *
    * Note that the type variable {'b} only appears as the return type.
    * This means that {to} doesn't return normally to the caller and can
    * be called from a context of any type.
    *)

   val call : ('a -> 'b, 'a) CPS.t
   (**
    * Simpler, but less flexibly typed, interface to {within} and {to}.
    * Specifically, {call f} is equivalent to {within (f o to)}.
    *)
end

(Typing First-Class Continuations in ML discusses the typing of a related construct.) The implementation (exit.sml) is straightforward:

structure Exit :> EXIT = struct
   type 'a t = 'a -> exn

   fun within block = let
      exception EscapedExit of 'a
   in
      block EscapedExit
      handle EscapedExit value => value
   end

   fun to exit value = raise exit value

   fun call block = within (block o to)
end

Here is an example of how one could implement a find function given an app function:

fun appToFind (app : ('a -> unit) -> 'b -> unit)
              (predicate : 'a -> bool)
              (data : 'b) =
    Exit.call
       (fn return =>
           (app (fn x =>
                    if predicate x then
                       return (SOME x)
                    else
                       ())
                data
          ; NONE))

In the above, as soon as the expression predicate x evaluates to true the app invocation is terminated.

Continuation-passing Style (CPS)

A general way to implement complex control patterns is to use CPS. In CPS, instead of returning normally, functions invoke a function passed as an argument. In general, multiple continuation functions may be passed as arguments and the ordinary return continuation may also be used. As an example, here is a function that finds the leftmost element of a binary tree satisfying a given predicate:

datatype 'a tree = LEAF | BRANCH of 'a tree * 'a * 'a tree

fun find predicate = let
   fun recurse continue =
       fn LEAF =>
          continue ()
        | BRANCH (lhs, elem, rhs) =>
          recurse
             (fn () =>
                 if predicate elem then
                    SOME elem
                 else
                    recurse continue rhs)
             lhs
in
   recurse (fn () => NONE)
end

Note that the above function returns as soon as the leftmost element satisfying the predicate is found.