[MLton] bug report, vector of char problem

Stephen Weeks MLton@mlton.org
Wed, 6 Apr 2005 11:40:36 -0700


> > Just out of curiosity anyone have an informal list of subtle bugs that 
> > didn't tickle a bug in the typed-IL? We were having a lunch discussion about 
> > if typed-IL really prevented compiler bugs... I'm wondering what the MLton 
> > experience has been.
...
> I think our experience has been very positive.  We throw in to the 
> 'type-checker' for our typed ILs a bit more than just types; for example, 
> the SSA IL checks the def dominates use property as well. 

I strongly echo Matthew's sentiments about our positive experience
with typed ILs.  They have caught innumerable bugs, many of which
would have taken a long time to debug because code generation would
have completed and the bug would affect memory in very unpredictable
ways.

However, the changelog is not a good indicator of how helpful typed
ILs are, as most of their use comes when developing a new pass, when
changes and debugging are fast and furious, before anything makes it
into the changelog.

The places where MLton doesn't have typed ILs, in Rssa, Machine, and
the codegen, have easily been the source of the most, and the most
challenging, bugs.  Also, the lack of a type system to check the GC
and the interface to the GC has been a source of bugs.

Basically, types in the ILs help avoid at least as many bugs as types
in the source language do when programming, except that the bugs in
the ILs are even more important to catch early, since they can lead to
seg faults.

Of course, types don't catch everything.  The current "vector of char"
bug is due to the pass that determines data representations
(SsaToRssa) getting confused about the flattened array of (char *
char).  It correctly uses 2 bytes per element, but incorrectly stores
both chars in the same 1-byte slot of each element.  It's not likely
that a type system would catch the overlapping use, although it might
catch the fact that the other 1-byte slot is uninitialized.
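
To make the layout problem concrete, here is a rough Python sketch of
the symptom described above.  It is only a model, not MLton code: the
names store_pairs, load_pairs, and second_offset are made up for
illustration, and a zero-filled bytearray stands in for heap memory
that would really be uninitialized.

```python
ELEM_SIZE = 2  # bytes per flattened (char * char) element

def store_pairs(pairs, second_offset):
    """Pack (char, char) pairs into a flat byte buffer.

    second_offset is the byte offset of the second char within each
    element: 1 is the intended layout, 0 models the overlapping-slot bug.
    (Hypothetical helper; not part of MLton.)
    """
    buf = bytearray(ELEM_SIZE * len(pairs))  # zero-filled stand-in for the heap
    for i, (a, b) in enumerate(pairs):
        base = i * ELEM_SIZE
        buf[base] = ord(a)                  # first char goes in slot 0
        buf[base + second_offset] = ord(b)  # second char: slot 1, or slot 0 if buggy
    return buf

def load_pairs(buf, second_offset):
    """Read the pairs back using the same layout assumption."""
    return [(chr(buf[i]), chr(buf[i + second_offset]))
            for i in range(0, len(buf), ELEM_SIZE)]

pairs = [("a", "b"), ("c", "d")]
good = store_pairs(pairs, second_offset=1)  # each char has its own slot
bad = store_pairs(pairs, second_offset=0)   # second char overwrites the first
```

With the buggy offset the element size is still 2 bytes, so the buffer
has the right length, but the first char of each pair is lost and the
second slot is never written.  A check that every slot is initialized
could flag the second symptom; the overlap itself looks like two
well-typed 1-byte stores.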