MLton 20241230

AST is the IntermediateLanguage produced by the FrontEnd and translated by Elaborate to CoreML.

Description

The abstract syntax tree produced by the FrontEnd.

Implementation

Type Checking

The AST IntermediateLanguage has no independent type checker. Type inference is performed on an AST program as part of Elaborate.

Details and Notes

Source locations

MLton uses a relatively clean method for annotating the abstract syntax tree with source location information. Every source program phrase is "wrapped" by means of the WRAPPED interface:

signature WRAPPED =
   sig
      type node'
      type obj

      val dest: obj -> node' * Region.t
      val makeRegion': node' * SourcePos.t * SourcePos.t -> obj
      val makeRegion: node' * Region.t -> obj
      val node: obj -> node'
      val region: obj -> Region.t
   end
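As a rough illustration of how such an interface might be satisfied, the sketch below pairs each phrase with its region inside an opaque type. It is not MLton's implementation: the SourcePos and Region structures are simplified stand-ins for the real modules, and the functor name Wrap, its parameter phrase, and Region.make are assumptions made for this example.

```sml
(* Hypothetical stand-ins for MLton's SourcePos and Region modules. *)
structure SourcePos =
   struct
      type t = {file: string, line: int, column: int}
   end

structure Region =
   struct
      type t = {left: SourcePos.t, right: SourcePos.t}
      fun make (left: SourcePos.t, right: SourcePos.t): t =
         {left = left, right = right}
   end

signature WRAPPED =
   sig
      type node'
      type obj

      val dest: obj -> node' * Region.t
      val makeRegion': node' * SourcePos.t * SourcePos.t -> obj
      val makeRegion: node' * Region.t -> obj
      val node: obj -> node'
      val region: obj -> Region.t
   end

(* Assumed helper functor: wraps any phrase type by pairing it with a region.
   The opaque ascription hides the representation of obj, so clients can
   only build one through makeRegion or makeRegion'. *)
functor Wrap (type phrase) :> WRAPPED where type node' = phrase =
   struct
      type node' = phrase
      datatype obj = T of {node: node', region: Region.t}

      fun makeRegion (n, r) = T {node = n, region = r}
      fun makeRegion' (n, l, r) = makeRegion (n, Region.make (l, r))
      fun node (T {node, ...}) = node
      fun region (T {region, ...}) = region
      fun dest w = (node w, region w)
   end

(* Usage: wrap a string "phrase" that spans columns 1 to 4 of line 1. *)
structure StringPhrase = Wrap (type phrase = string)

val left = {file = "a.sml", line = 1, column = 1}
val right = {file = "a.sml", line = 1, column = 4}
val wrapped = StringPhrase.makeRegion' ("foo", left, right)
val (phrase, reg) = StringPhrase.dest wrapped
```

Because obj is abstract outside the functor, the only way to obtain one is to supply a region, which is the property the interface is meant to enforce.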

The key idea is that node' is the type of an unannotated syntax phrase and obj is the type of its annotated counterpart. In the implementation, every node' is annotated with a Region.t (region.sig, region.sml), which records the phrase's left and right source positions; each position is a SourcePos.t (source-pos.sig, source-pos.sml), denoting a particular file, line, and column. A typical use of the WRAPPED interface is illustrated by the following code, which describes patterns:

datatype node =
   App of {con: Longcon.t, arg: t, wasInfix: bool}
 | Const of Const.t
 | Constraint of t * Type.t
 | FlatApp of t vector
 | Layered of {constraint: Type.t option,
               fixop: Fixop.t,
               pat: t,
               var: Var.t}
 | List of t vector
 | Paren of t
 | Or of t vector
 | Record of {flexible: bool,
              items: (Record.Field.t * Region.t * Item.t) vector}
 | Tuple of t vector
 | Var of {fixop: Fixop.t,
           name: Longvid.t}
 | Vector of t vector
 | Wild

include WRAPPED sharing type node' = node
                sharing type obj = t
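To make the relationship between node and t concrete, here is a small sketch of a pattern type whose subphrases are themselves wrapped, together with a traversal that reports where each variable occurs. It is a toy model under stated assumptions: Region.t is reduced to a pair of character offsets, the constructor set is cut down to three cases, makeRegion' is omitted, and the function vars is invented for this example.

```sml
(* Hypothetical stand-in for Region.t: a pair of character offsets. *)
structure Region = struct type t = int * int end

structure Pat =
   struct
      (* node and t are mutually recursive: a node contains wrapped
         subpatterns, and a wrapped pattern contains a node. *)
      datatype node =
         Var of string
       | Tuple of t vector
       | Wild
      and t = T of {node: node, region: Region.t}

      type node' = node
      type obj = t

      fun makeRegion (n, r) = T {node = n, region = r}
      fun node (T {node, ...}) = node
      fun region (T {region, ...}) = region
      fun dest p = (node p, region p)

      (* Every variable in a pattern, paired with the region it occupies. *)
      fun vars (p: t): (string * Region.t) list =
         case node p of
            Var x => [(x, region p)]
          | Tuple ps => Vector.foldr (fn (q, acc) => vars q @ acc) [] ps
          | Wild => []
   end

(* The pattern (x, _) at offsets 0-6, with x at 1-2 and _ at 4-5. *)
val x = Pat.makeRegion (Pat.Var "x", (1, 2))
val wild = Pat.makeRegion (Pat.Wild, (4, 5))
val pair = Pat.makeRegion (Pat.Tuple (Vector.fromList [x, wild]), (0, 6))
val found = Pat.vars pair
```

The traversal never has to ask whether a region is present; it simply calls region on whatever phrase it is visiting.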

Thus, AST nodes are cleanly separated from source locations. By way of contrast, consider the approach taken by SML/NJ (and also by the CKit Library). Each datatype denoting a syntax phrase dedicates a special constructor for annotating source locations:

datatype pat = WildPat                             (* empty pattern *)
             | AppPat of {constr:pat,argument:pat} (* application *)
             | MarkPat of pat * region             (* mark a pattern *)

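The main drawback of this approach is that static type checking is not sufficient to guarantee that the AST emitted from the FrontEnd is properly annotated: the marking constructor is optional, so a pattern built without it still type-checks. With WRAPPED, by contrast, a phrase cannot be constructed without supplying a region.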
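The difference can be seen in a small side-by-side sketch. The first datatype follows the marker-constructor style quoted above; the second is a simplified wrapped variant. The region type and the names regionOf, wpat, Wrapped, and wregion are assumptions made for this illustration and do not come from SML/NJ or MLton.

```sml
(* Hypothetical stand-in for a source region: a pair of offsets. *)
type region = int * int

(* Marker-constructor style: the annotation is just another constructor. *)
datatype pat =
   WildPat
 | AppPat of {constr: pat, argument: pat}
 | MarkPat of pat * region

(* This type-checks although no MarkPat appears anywhere in the tree. *)
val unmarked = AppPat {constr = WildPat, argument = WildPat}

(* Consumers therefore have to cope with a missing annotation. *)
fun regionOf (MarkPat (_, r)) = SOME r
  | regionOf _ = NONE

(* Wrapped style: a region is required to build any phrase at all. *)
datatype node =
   Wild
 | App of {constr: wpat, argument: wpat}
and wpat = Wrapped of node * region

(* Total: every wrapped phrase carries a region. *)
fun wregion (Wrapped (_, r)) = r

val marked =
   Wrapped (App {constr = Wrapped (Wild, (0, 1)),
                 argument = Wrapped (Wild, (2, 3))},
            (0, 3))
```

In the first style regionOf must return an option, whereas wregion is total; omitting a region in the second style is a type error rather than a latent bug.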