MLton 20241230

Here are a number of syntactic conventions useful for programming in SML.

General

  • A line of code never exceeds 80 columns.

  • Only split a syntactic entity across multiple lines if it doesn’t fit on one line within 80 columns.

  • Use alphabetical order wherever possible.

  • Avoid redundant parentheses.

  • When using :, there is no space before the colon, and a single space after it.

Identifiers

  • Variables, record labels and type constructors begin with and use small letters, using capital letters to separate words.

    cost
    maxValue
  • Variables that represent collections of objects (lists, arrays, vectors, …​) are often suffixed with an s.

    xs
    employees
  • Constructors, structure identifiers, and functor identifiers begin with a capital letter.

    Queue
    LinkedList
  • Signature identifiers are in all capitals, using _ to separate words.

    LIST
    BINARY_HEAP

Types

  • Alphabetize record labels. In a record type, there are spaces after colons and commas, but not before colons or commas, or at the delimiters { and }.

    {bar: int, foo: int}
  • Only split a record type across multiple lines if it doesn’t fit on one line. If a record type must be split over multiple lines, put one field per line.

    {bar: int,
     foo: real * real,
     zoo: bool}
  • In a tuple type, there are spaces before and after each *.

    int * bool * real
  • Only split a tuple type across multiple lines if it doesn’t fit on one line. In a tuple type split over multiple lines, there is one type per line, and the *-s go at the beginning of the lines.

    int
    * bool
    * real

    It may also be useful to parenthesize to make the grouping more apparent.

    (int
     * bool
     * real)
  • In an arrow type split over multiple lines, put the arrow at the beginning of its line.

    int * real
    -> bool

    It may also be useful to parenthesize to make the grouping more apparent.

    (int * real
     -> bool)
  • Avoid redundant parentheses.

  • Arrow types associate to the right, so write

    a -> b -> c

    not

    a -> (b -> c)
  • Type constructor application associates to the left, so write

    int ref list

    not

    (int ref) list
  • Type constructor application binds more tightly than a tuple type, so write

    int list * bool list

    not

    (int list) * (bool list)
  • Tuple types bind more tightly than arrow types, so write

    int * bool -> real

    not

    (int * bool) -> real

Core

  • A core expression or declaration split over multiple lines does not contain any blank lines.

  • A record field selector has no space between the # and the record label. So, write

    #foo

    not

    # foo
  • A tuple has a space after each comma, but not before, and not at the delimiters ( and ).

    (e1, e2, e3)
  • A tuple split over multiple lines has one element per line, and the commas go at the end of the lines.

    (e1,
     e2,
     e3)
  • A list has a space after each comma, but not before, and not at the delimiters [ and ].

    [e1, e2, e3]
  • A list split over multiple lines has one element per line, and the commas at the end of the lines.

    [e1,
     e2,
     e3]
  • A record has spaces before and after =, a space after each comma, but not before, and not at the delimiters { and }. Field names appear in alphabetical order.

    {bar = 13, foo = true}
  • A sequence expression has a space after each semicolon, but not before.

    (e1; e2; e3)
  • A sequence expression split over multiple lines has one expression per line, and the semicolons at the beginning of lines. Lisp and Scheme programmers may find this hard to read at first.

    (e1
     ; e2
     ; e3)

    Rationale: this makes it easy to visually spot the beginning of each expression, which becomes more valuable as the expressions themselves are split across multiple lines.

  • An application expression has a space between the function and the argument. There are no parens unless the argument is a tuple (in which case the parens are really part of the tuple, not the application).

    f a
    f (a1, a2, a3)
  • Avoid redundant parentheses. Application associates to left, so write

    f a1 a2 a3

    not

    ((f a1) a2) a3
  • Infix operators have a space before and after the operator.

    x + y
    x * y - z
  • Avoid redundant parentheses. Use OperatorPrecedence. So, write

    x + y * z

    not

    x + (y * z)
  • An andalso expression split over multiple lines has the andalso at the beginning of subsequent lines.

    e1
    andalso e2
    andalso e3
  • A case expression is indented as follows

    case e1 of
       p1 => e1
     | p2 => e2
     | p3 => e3
  • A datatype’s constructors are alphabetized.

    datatype t = A | B | C
  • A datatype declaration has a space before and after each |.

    datatype t = A | B of int | C
  • A datatype split over multiple lines has one constructor per line, with the | at the beginning of lines and the constructors beginning 3 columns to the right of the datatype.

    datatype t =
       A
     | B
     | C
  • A fun declaration may start its body on the subsequent line, indented 3 spaces.

    fun f x y =
       let
          val z = x + y + z
       in
          z
       end
  • An if expression is indented as follows.

    if e1
       then e2
    else e3
  • A sequence of if-then-else-s is indented as follows.

    if e1
       then e2
    else if e3
       then e4
    else if e5
       then e6
    else e7
  • A let expression has the let, in, and end on their own lines, starting in the same column. Declarations and the body are indented 3 spaces.

    let
       val x = 13
       val y = 14
    in
       x + y
    end
  • A local declaration has the local, in, and end on their own lines, starting in the same column. Declarations are indented 3 spaces.

    local
       val x = 13
    in
       val y = x
    end
  • An orelse expression split over multiple lines has the orelse at the beginning of subsequent lines.

    e1
    orelse e2
    orelse e3
  • A val declaration has a space before and after the =.

    val p = e
  • A val declaration can start the expression on the subsequent line, indented 3 spaces.

    val p =
       if e1 then e2 else e3

Signatures

  • A signature declaration is indented as follows.

    signature FOO =
       sig
          val x: int
       end

    Exception: a signature declaration in a file to itself can omit the indentation to save horizontal space.

    signature FOO =
    sig
    
    val x: int
    
    end

    In this case, there should be a blank line after the sig and before the end.

  • A val specification has a space after the colon, but not before.

    val x: int

    Exception: in the case of operators (like +), there is a space before the colon to avoid lexing the colon as part of the operator.

    val + : t * t -> t
  • Alphabetize specifications in signatures.

    sig
       val x: int
       val y: bool
    end

Structures

  • A structure declaration has a space on both sides of the =.

    structure Foo = Bar
  • A structure declaration split over multiple lines is indented as follows.

    structure S =
       struct
          val x = 13
       end

    Exception: a structure declaration in a file to itself can omit the indentation to save horizontal space.

    structure S =
    struct
    
    val x = 13
    
    end

    In this case, there should be a blank line after the struct and before the end.

  • Declarations in a struct are separated by blank lines.

    struct
       val x =
          let
             y = 13
          in
             y + 1
          end
    
       val z = 14
    end

Functors

  • A functor declaration has spaces after each : (or :>) but not before, and a space before and after the =. It is indented as follows.

    functor Foo (S: FOO_ARG): FOO =
       struct
           val x = S.x
       end

    Exception: a functor declaration in a file to itself can omit the indentation to save horizontal space.

    functor Foo (S: FOO_ARG): FOO =
    struct
    
    val x = S.x
    
    end

    In this case, there should be a blank line after the struct and before the end.