[MLton] POPL review
Matthew Fluet
fluet@cs.cornell.edu
Thu, 19 Jan 2006 10:03:54 -0500 (EST)
I thought I would pass along a few impressions from POPL that would be of
interest to the MLton list.
* Invited Talk: The Scala Experiment -- Can We Provide Better Language
Support for Component Systems? (M. Odersky)
http://www.cs.princeton.edu/~dpw/popl/06/odersky.pdf
The Scala language (http://scala.epfl.ch) is one of these multi-paradigm
languages that attempts to combine OO with functional, with a dose of
native XML support. But, Martin's real focus in the invited talk was
dealing with component systems -- how do you build complex software
systems that are robust in the face of changing out one componet for
another. While there is a fairly expressive type system living in the
language, I'm not entirely sold on whether or not this leads to better
software engineering.
* Formal Certification of a Compiler Back-end or: Programming a Compiler
with a Proof Assistant (X. Leroy)
Xavier's herculean accomplishment of getting a formal certification for
the compilation of Cminor (a stripped down C) to PowerPC assembly. What I
find most compelling about this work is that it _doesn't_ follow the
"proof by construction" paradigm that lies at the heart of constructive
type theories, such as supports the Coq proof assistant. Rather than
proving \forall\exists properties and extracting functions, Xavier simply
wrote down the functions he was interested in and proved they had the
required properties. And, where that proved (pardon the pun) too
difficult, he resorted to proving, for example, that a _validation
function_ could assert that a given register allocation did indeed color
the interference graph correctly, without proving that the George-Appel
register allocator itself was correct. This gives you more conditional
theorems (if the compiler produces assembly, then it is semantically
equivalent to the source --- but no guarantees that the compiler actualy
succeeds in producing assembly).
* Environment Analysis via \Delta-CFA (M. Might, O. Shivers)
As I said in my message to Ben Chambers, this was one of the smoothest
talks I saw. The Super-Beta inlining optimization requires that the
call-site have the same dynamic environment as the closure to be inlined.
This environment analysis is fundamentally different than the
control-flow analysis of k-CFA. An open question for me is whether this
kind of analysis might be useful for guiding better closure
representations in MLton.
* Invited Talk: The Next Mainstream Programming Language: A Game
Developer's Perspective (T. Sweeney)
http://www.cs.princeton.edu/~dpw/popl/06/Tim-POPL.ppt
I think the big take away message is that academia doesn't give industry
enough credit. In particular, programmers are not blind to the
complexities and predisposition to bugs inherent in C++ programming.
Also, programmers may be a lot more willing to swallow a radical change in
programming language if it really improves productivity.
* A Verifiable SSA Program Representation for Aggressive Compiler
Optimization (V.S. Menon, N. Glew, B.R. Murphy, A. McCeight, T. Shpeisman,
A.-R. Adl-Tabatabai, L. Petersen)
This is in the context of the STARJit compiler. The big goal is to
optimize the redundant checks that are left by naively translating Java
bytecode into a more explicit representation on the way to native code.
They adopt an SSA representation that is very much like MLton's IL (except
they continue to use \phi-functions and all the complexities that
entails); they add on proof terms and proof types that validate particular
unsafe acceses (like array dereference). The big insight is that by
reifying proof terms as SSA variables, then data-dependence ensures that
all the supporting checks remain in one form or another. The other
advantage is that most optimization passes need not know anything specific
about proof terms -- they treat them just like SSA variables.
* Staged Alocation: A Compositional Technique for Specifying and
Implementing Procedure Calling Conventions (R. Olinsky, C. Lindig, N.
Ramsey)
Implementing calling conventions is hard and error-prone. This work gives
a simple declarative description of calling conventions that can be broken
down in to simple primitives, easily implemented and composed in a
compiler. An interesting point is that rather than finding the
most-expressive language for describing calling conventions, they focus on
primitives that give rise to sensible calling conventions.
===========================================================================
===========================================================================
That was POPL formally, but, of course, there were a lot of interesting
discussions outside the technical programme.
Martin Elsman told me that the MLKit is gearing up for a new release with
both full Standard Basis Library support and ML Basis file support. He
also told me that he's been able to bootstrap the MLKit with MLton, seeing
a decent speed up of the resulting compiler. Unfortunately, the MLton
compiled version of the MLKit weighs in at 100MB (75MB stripped), which is
quite a blowup. I'm hoping we can investigate.
John Reppy and Aaron Turon told me a bit about what is going on at
UChicago. They are preparing to release an ML-Lex replacement, based on
Brzozowski's notion of regular expression derivatives. They hope to go on
to develop an ML-Yacc replacement (probably along the lines of ANTLR).
They have also been working on SML support in Eclipse. This is part of an
SML infrastructure grant, so it ought to yield some nice artifacts for the
community at large.
Umut Acar and Matthew Hammer from TTI-Chicago gave a 5-min madness talk at
SPACE, where they wondered how to get better performance out of their
self-adjusting computation infrastructure
(http://ttic.uchicago.edu/~umut/sting/). Their fundamental problem is
that their library allocates data contrary to the generational hypothesis.
I suggested that it might be possible to expose hooks like:
val MLton.GC.allocInOldGen : (unit -> 'a) -> 'a
val MLton.GC.allocInNursery : (unit -> 'a) -> 'a
that would cause allocations during the run of the thunk to happen in the
appropriate area of the heap.