Mentors
The following developers have agreed to serve as mentors for the 2013 Google Summer of Code:
Ideas List
Implement a Partial Redundancy Elimination (PRE) Optimization
Partial redundancy elimination (PRE) is a program transformation that removes operations that are redundant on some, but not necessarily all paths, through the program. PRE can subsume both common subexpression elimination and loop-invariant code motion, and is therefore a potentially powerful optimization. However, a naïve implementation of PRE on a program in static single assignment (SSA) form is unlikely to be effective. This project aims to adapt and implement the SSAPRE algorithm(s) of Thomas VanDrunen in MLton’s SSA intermediate language.
Background:
-
Anticipation-based partial redundancy elimination for static single assignment form; Thomas VanDrunen and Antony L. Hosking
-
Partial Redundancy Elimination for Global Value Numbering; Thomas VanDrunen
-
Value-Based Partial Redundancy Elimination; Thomas VanDrunen and Antony L. Hosking
-
Partial redundancy elimination in SSA form; Robert Kennedy, Sun Chan, Shin-Ming Liu, Raymond Lo, Peng Tu, and Fred Chow
Recommended Skills: SML programming experience; some middle-end compiler experience
Design and Implement a Heap Profiler
A heap profile is a description of the space usage of a program. A heap profile is concerned with the allocation, retention, and deallocation (via garbage collection) of heap data during the execution of a program. A heap profile can be used to diagnose performance problems in a functional program that arise from space leaks. This project aims to design and implement a heap profiler for MLton compiled programs.
Background:
-
GCspy: an adaptable heap visualisation framework; Tony Printezis and Richard Jones
-
New dimensions in heap profiling; Colin Runciman and Niklas Röjemo
-
Heap profiling for space efficiency; Colin Runciman and Niklas Röjemo
-
Heap profiling of lazy functional programs; Colin Runciman and David Wakeling
Recommended Skills: C and SML programming experience; some experience with UI and visualization
Garbage Collector Improvements
The garbage collector plays a significant role in the performance of functional languages. Garbage collect too often, and program performance suffers due to the excessive time spent in the garbage collector. Garbage collect not often enough, and program performance suffers due to the excessive space used by the uncollected garbage. One particular issue is ensuring that a program utilizing a garbage collector "plays nice" with other processes on the system, by not using too much or too little physical memory. While there are some reasonable theoretical results about garbage collections with heaps of fixed size, there seems to be insufficient work that really looks carefully at the question of dynamically resizing the heap in response to the live data demands of the application and, similarly, in response to the behavior of the operating system and other processes. This project aims to investigate improvements to the memory behavior of MLton compiled programs through better tuning of the garbage collector.
Background:
-
Automated Heap Sizing in the Poly/ML Runtime (Position Paper); David White, Jeremy Singer, Jonathan Aitken, and David Matthews
-
Isla Vista Heap Sizing: Using Feedback to Avoid Paging; Chris Grzegorczyk, Sunil Soman, Chandra Krintz, and Rich Wolski
-
Controlling garbage collection and heap growth to reduce the execution time of Java applications; Tim Brecht, Eshrat Arjomandi, Chang Li, and Hang Pham
-
Garbage collection without paging; Matthew Hertz, Yi Feng, and Emery D. Berger
-
Automatic heap sizing: taking real memory into account; Ting Yang, Matthew Hertz, Emery D. Berger, Scott F. Kaplan, and J. Eliot B. Moss
Recommended Skills: C programming experience; some operating systems and/or systems programming experience; some compiler and garbage collector experience
Implement Successor ML Language Features
Any programming language, including Standard ML, can be improved. The community has identified a number of modest extensions and revisions to the Standard ML programming language that would likely prove useful in practice. This project aims to implement these language features in the MLton compiler.
Background:
-
A critique of Standard ML; Andrew W. Appel
Recommended Skills: SML programming experience; some front-end compiler experience (i.e., scanners and parsers)
Implement Source-level Debugging
Debugging is a fact of programming life. Unfortunately, most SML implementations (including MLton) provide little to no source-level debugging support. This project aims to add basic to intermediate source-level debugging support to the MLton compiler. MLton already supports source-level profiling, which can be used to attribute bytes allocated or time spent in source functions. It should be relatively straightforward to leverage this source-level information into basic source-level debugging support, with the ability to set/unset breakpoints and step through declarations and functions. It may be possible to also provide intermediate source-level debugging support, with the ability to inspect in-scope variables of basic types (e.g., types compatible with MLton’s foreign function interface).
Background:
Recommended Skills: SML programming experience; some compiler experience
SIMD Primitives
Most modern processors offer some direct support for SIMD (Single Instruction, Multiple Data) operations, such as Intel’s MMX/SSE instructions, AMD’s 3DNow! instructions, and IBM’s AltiVec. Such instructions are particularly useful for multimedia, scientific, and cryptographic applications. This project aims to add preliminary support for vector data and vector operations to the MLton compiler. Ideally, after surveying SIMD instruction sets and SIMD support in other compilers, a core set of SIMD primitives with broad architecture and compiler support can be identified. After adding SIMD primitives to the core compiler and carrying them through to the various backends, there will be opportunities to design and implement an SML library that exposes the primitives to the SML programmer as well as opportunities to design and implement auto-vectorization optimizations.
Background:
Recommended Skills: SML programming experience; some compiler experience; some computer architecture experience
RTOS Support
This project entails porting the MLton compiler to RTOSs such as: RTEMS, RT Linux, and FreeRTOS. The project will include modifications to the MLton build and configuration process. Students will need to extend the MLton configuration process for each of the RTOSs. The MLton compilation process will need to be extended to invoke the C cross compilers the RTOSs provide for embedded support. Test scripts for validation will be necessary and these will need to be run in emulators for supported architectures.
Recommended Skills: C programming experience; some scripting experience
Region Based Memory Management
Region based memory management is an alternative automatic memory management scheme to garbage collection. Regions can be inferred by the compiler (e.g., Cyclone and MLKit) or provided to the programmer through a library. Since many students do not have extensive experience with compilers we plan on adopting the later approach. Creating a viable region based memory solution requires the removal of the GC and changes to the allocator. Additionally, write barriers will be necessary to ensure references between two ML objects is never established if the left hand side of the assignment has a longer lifetime than the right hand side. Students will need to come up with an appropriate interface for creating, entering, and exiting regions (examples include RTSJ scoped memory and SCJ scoped memory).
Background:
-
Cyclone
-
MLKit
-
RTSJ + SCJ scopes
Recommended Skills: SML programming experience; C programming experience; some compiler and garbage collector experience
Integration of Multi-MLton
MultiMLton is a compiler and runtime environment that targets scalable multicore platforms. It is an extension of MLton. It combines new language abstractions and associated compiler analyses for expressing and implementing various kinds of fine-grained parallelism (safe futures, speculation, transactions, etc.), along with a sophisticated runtime system tuned to efficiently handle large numbers of lightweight threads. The core stable features of MultiMLton will need to be integrated with the latest MLton public release. Certain experimental features, such as support for the Intel SCC and distributed runtime will be omitted. This project requires students to understand the delta between the MultiMLton code base and the MLton code base. Students will need to create build and configuration scripts for MLton to enable MultiMLton features.
Background
Recommended Skills: SML programming experience; C programming experience; some compiler experience