MLton 20241230

The purpose of successor ML, or sML for short, is to provide a vehicle for the continued evolution of ML, using Standard ML as a starting point. The intention is for successor ML to be a living, evolving dialect of ML that is responsive to community needs and advances in language design, implementation, and semantics.

SuccessorML Features in MLton

The following SuccessorML features have been implemented in MLton. The features are disabled by default, and may be enabled utilizing the feature’s corresponding ML Basis annotation which is listed directly after the feature name. In addition, the allowSuccessorML {false|true} annotation can be used to simultaneously enable all of the features.

  • do Declarations: allowDoDecls {false|true}

    Allow a do exp declaration form, which evaluates exp for its side effects. The following example uses a do declaration:

    do print "Hello world.\n"

    and is equivalent to:

    val () = print "Hello world.\n"
  • Extended Constants: allowExtendedConsts {false|true}

    Allow or disallow all of the extended constants features. This is a proxy for all of the following annotations.

    • Extended Numeric Constants: allowExtendedNumConsts {false|true}

      Allow underscores as a separator in numeric constants and allow binary integer and word constants.

      Underscores in a numeric constant must occur between digits and consecutive underscores are allowed.

      Binary integer constants use the prefix 0b and binary word constants use the prefix 0wb.

      The following example uses extended numeric constants (although it may be incorrectly syntax highlighted):

      val pb = 0b10101
      val nb = ~0b10_10_10
      val wb = 0wb1010
      val i = 4__327__829
      val r = 6.022_140_9e23
    • Extended Text Constants: allowExtendedTextConsts {false|true}

      Allow characters with integer codes ≥ 128 and ≤ 247 that correspond to syntactically well-formed UTF-8 byte sequences in text constants.

      Any 1, 2, 3, or 4 byte sequence that can be properly decoded to a binary number according to the UTF-8 encoding/decoding scheme is allowed in a text constant (but invalid sequences are not explicitly rejected) and denotes the corresponding sequence of characters with integer codes ≥ 128 and ≤ 247. This feature enables "UTF-8 convenience" (but not comprehensive Unicode support); in particular, it allows one to copy text from a browser and paste it into a string constant in an editor and, furthermore, if the string is printed to a terminal, then will (typically) appear as the original text. The following example uses UTF-8 byte sequences:

      val s1 : String.string = "\240\159\130\161"
      val s2 : String.string = "🂡"
      val _ = print ("s1 --> " ^ s1 ^ "\n")
      val _ = print ("s2 --> " ^ s2 ^ "\n")
      val _ = print ("String.size s1 --> " ^ Int.toString (String.size s1) ^ "\n")
      val _ = print ("String.size s2 --> " ^ Int.toString (String.size s2) ^ "\n")
      val _ = print ("s1 = s2 --> " ^ Bool.toString (s1 = s2) ^ "\n")

      and, when compiled and executed, will display:

      s1 --> 🂡
      s2 --> 🂡
      String.size s1 --> 4
      String.size s2 --> 4
      s1 = s2 --> true

      Note that the String.string type corresponds to any sequence of 8-bit values, including invalid UTF-8 sequences; hence the string constant "\192" (a UTF-8 leading byte with no UTF-8 continuation byte) is valid. Similarly, the Char.char type corresponds to a single 8-bit value; hence the char constant #"α" is not valid, as the text constant "α" denotes a sequence of two 8-bit values.

  • Line Comments: allowLineComments {false|true}

    Allow line comments beginning with the token (*). The following example uses a line comment:

    (*) This is a line comment

    Line comments properly nest within block comments. The following example uses line comments nested within block comments:

    (*
    val x = 4 (*) This is a line comment
    *)
    
    (*
    val y = 5 (*) This is a line comment *)
    *)
  • Optional Pattern Bars: allowOptBar {false|true}

    Allow a bar to appear before the first match rule of a case, fn, or handle expression, allow a bar to appear before the first function-value binding of a fun declaration, and allow a bar to appear before the first constructor binding or description of a datatype declaration or specification. The following example uses leading bars in a datatype declaration, a fun declaration, and a case expression:

    datatype t =
      | C
      | B
      | A
    
    fun
      | f NONE = 0
      | f (SOME t) =
         (case t of
            | A => 1
            | B => 2
            | C => 3)

    By eliminating the special case of the first element, this feature allows for simpler refactoring (e.g., sorting the lines of the datatype declaration’s constructor bindings to put the constructors in alphabetical order).

  • Optional Semicolons: allowOptSemicolon {false|true}

    Allow a semicolon to appear after the last expression in a sequence or let-body expression. The following example uses a trailing semicolon in the body of a let expression:

    fun h z =
      let
        val x = 3 * z
      in
         f x ;
         g x ;
      end

    By eliminating the special case of the last element, this feature allows for simpler refactoring.

  • Disjunctive (Or) Patterns: allowOrPats {false|true}

    Allow disjunctive (a.k.a., "or") patterns of the form pat1 | pat2, which matches a value that matches either pat1 or pat2. Disjunctive patterns have lower precedence than as patterns and constraint patterns, much as orelse expressions have lower precedence than andalso expressions and constraint expressions. Both sub-patterns of a disjunctive pattern must bind the same variables with the same types. The following example uses disjunctive patterns:

    datatype t = A of int | B of int | C of int | D of int * int | E of int * int
    
    fun f t =
      case t of
         A x | B x | C x => x + 1
       | D (x, _) | E (_, x) => x * 2
  • Record Punning Expressions: allowRecordPunExps {false|true}

    Allow record punning expressions, whereby an identifier vid as an expression row in a record expression denotes the expression row vid = vid (i.e., treating a label as a variable). The following example uses record punning expressions (and also record punning patterns):

    fun incB r =
      case r of {a, b, c} => {a, b = b + 1, c}

    and is equivalent to:

    fun incB r =
      case r of {a = a, b = b, c = c} => {a = a, b = b + 1, c = c}
  • withtype in Signatures: allowSigWithtype {false|true}

    Allow withtype to modify a datatype specification in a signature. The following example uses withtype in a signature (and also withtype in a declaration):

    signature STREAM =
      sig
        datatype 'a u = Nil | Cons of 'a * 'a t
        withtype 'a t = unit -> 'a u
      end
    structure Stream : STREAM =
      struct
        datatype 'a u = Nil | Cons of 'a * 'a t
        withtype 'a t = unit -> 'a u
      end

    and is equivalent to:

    signature STREAM =
      sig
        datatype 'a u = Nil | Cons of 'a * (unit -> 'a u)
        type 'a t = unit -> 'a u
      end
    structure Stream : STREAM =
      struct
        datatype 'a u = Nil | Cons of 'a * (unit -> 'a u)
        type 'a t = unit -> 'a u
      end
  • Vector Expressions and Patterns: allowVectorExpsAndPats {false|true}

    Allow or disallow vector expressions and vector patterns. This is a proxy for all of the following annotations.

    • Vector Expressions: allowVectorExps {false|true}

      Allow vector expressions of the form #[exp0, exp1, …​, expn-1] (where n ≥ 0). The expression has type Ï„ vector when each expression expi has type Ï„.

    • Vector Patterns: allowVectorPats {false|true}

      Allow vector patterns of the form #[pat0, pat1, …​, patn-1] (where n ≥ 0). The pattern matches values of type Ï„ vector when each pattern pati matches values of type Ï„.