XML is an IntermediateLanguage, translated from CoreML by Defunctorize, optimized by XMLSimplify, and translated by Monomorphise to SXML.
Description
XML is polymorphic, higher-order, with flat patterns. Every
XML expression is annotated with its type. Polymorphic
generalization is made explicit through type variables annotating
val
and fun
declarations. Polymorphic instantiation is made
explicit by specifying type arguments at variable references. XML
patterns can not be nested and can not contain wildcards, constraints,
flexible records, or layering.
Implementation
Type Checking
XML also has a type checker, used for debugging. At present, the
type checker is also the best specification of the type system of
XML. If you need more details, the type checker
(type-check.sig
,
type-check.fun
), is pretty short.
Since the type checker does not affect the output of the compiler (unless it reports an error), it can be turned off. The type checker recursively descends the program, checking that the type annotating each node is the same as the type synthesized from the types of the expressions subnodes.
Details and Notes
XML uses the same atoms as CoreML, hence all identifiers (constructors, variables, etc.) are unique and can have properties attached to them. Finally, XML has a simplifier (XMLShrink), which implements a reduction system.
Types
XML types are either type variables or applications of n-ary type constructors. There are many utility functions for constructing and destructing types involving built-in type constructors.
A type scheme binds list of type variables in a type. The only interesting operation on type schemes is the application of a type scheme to a list of types, which performs a simultaneous substitution of the type arguments for the bound type variables of the scheme. For the purposes of type checking, it is necessary to know the type scheme of variables, constructors, and primitives. This is done by associating the scheme with the identifier using its property list. This approach is used instead of the more traditional environment approach for reasons of speed.
XmlTree
Before defining XML
, the signature for language XML, we need to
define an auxiliary signature XML_TREE
, that contains the datatype
declarations for the expression trees of XML. This is done solely
for the purpose of modularity — it allows the simplifier and type
checker to be defined by separate functors (which take a structure
matching XML_TREE
). Then, Xml
is defined as the signature for a
module containing the expression trees, the simplifier, and the type
checker.
Both constructors and variables can have type schemes, hence both constructor and variable references specify the instance of the scheme at the point of references. An instance is specified with a vector of types, which corresponds to the type variables in the scheme.
XML patterns are flat (i.e. not nested). A pattern is a
constructor with an optional argument variable. Patterns only occur
in case
expressions. To evaluate a case expression, compare the
test value sequentially against each pattern. For the first pattern
that matches, destruct the value if necessary to bind the pattern
variables and evaluate the corresponding expression. If no pattern
matches, evaluate the default. All patterns of a case statement are
of the same variant of Pat.t
, although this is not enforced by ML’s
type system. The type checker, however, does enforce this. Because
tuple patterns are irrefutable, there will only ever be one tuple
pattern in a case expression and there will be no default.
XML contains value, exception, and mutually recursive function
declarations. There are no free type variables in XML. All type
variables are explicitly bound at either a value or function
declaration. At some point in the future, exception declarations may
go away, and exceptions may be represented with a single datatype
containing a unit ref
component to implement genericity.
XML expressions are like those of CoreML, with the following exceptions. There are no records expressions. After type inference, all records (some of which may have originally been tuples in the source) are converted to tuples, because once flexible record patterns have been resolved, tuple labels are superfluous. Tuple components are ordered based on the field ordering relation. XML eta expands primitives and constructors so that there are always fully applied. Hence, the only kind of value of arrow type is a lambda. This property is useful for flow analysis and later in code generation.
An XML program is a list of toplevel datatype declarations and a body expression. Because datatype declarations are not generative, the defunctorizer can safely move them to toplevel.