[MLton] Code being dropped that shouldn't
Matthew Fluet
fluet at tti-c.org
Thu Oct 9 20:00:13 PDT 2008
On Wed, 8 Oct 2008, Wesley W. Terpstra wrote:
> I've been trying to track down the last bug in the library regression
> test, but I think now it's not a bug in the test at all. Try an i386
> compiler (windows and linux both show the same behaviour). Run:
> ./library-test -debug true -keep g -codegen c -profile time
...
> I know there's some sort of basis-specific code
> elimination. Is it possible that librarySuffix is getting hit by this?
No.
Turns out that it was a (long-standing) bug with 'MLton_callFromC' in
c-main.h. In the library test, there is a point when the call-stack for
libm5 looks like:
  m5_open()
  libm5.sml (* top-level SML-code, via m5_open trampoline *)
  libm5confirmC()
  libm5smlFn{Private,Public}() (* _export-ed from libm5.sml *)
  MLton_callFromC()
  libm5.sml (* libm5smlFn{Private,Public}, via MLton_callFromC trampoline *)
The _export-ed 'libm5smlFn{Private,Public}' functions terminate the
'MLton_callFromC' trampoline by setting the global 'returnToC' to 'TRUE'.
Control then returns to the top-level SML-code from libm5.sml, which is
executing via the 'm5_open' trampoline. However, the 'returnToC' global
is still set to 'TRUE', so the 'm5_open' trampoline terminates at the
first inter-chunk transfer, without completing the top-level SML-code
from libm5.sml; in particular, it doesn't execute as far as
'suffixArchiveOrLibrary' and doesn't suspend the ML stack at the point
just before the 'clean atExit'. Rather, it suspends the ML stack much
earlier in the execution. When the top-level C-code (as executed by
check.sml) executes 'm5_close', it starts a new trampoline (setting
'returnToC' to 'FALSE') and resumes the ML code somewhere shortly after
the call to 'libm5confirmC()'. From there it executes to the first
'Primitive.MLton.Thread.returnToC' in 'suffixArchiveOrLibrary' and
returns to 'm5_close'. That explains why the 'clean atExit' never
executed.
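
To make the sequence above concrete, here is a toy model in C. It is only
a sketch and not MLton's runtime: the five-step layout of the top-level
code, the 'pc' and 'at_exit_ran' variables, and the 'step' and
'trampoline' helpers are all invented for illustration, and 'm5_open' /
'm5_close' here are stand-ins for the real entry points.

```c
#include <stdbool.h>

static bool returnToC;    /* one global flag shared by every trampoline */
static int  pc;           /* hypothetical saved position of the ML stack */
static bool at_exit_ran;  /* did the 'clean atExit' step execute? */

/* Stand-in for MLton_callFromC before the fix: the exported SML function
 * sets returnToC to end this trampoline, and nothing clears it again. */
static void call_from_c(void) {
    returnToC = false;
    do {
        returnToC = true;   /* exported function finishes immediately */
    } while (!returnToC);
}

/* One inter-chunk transfer of the (assumed) top-level SML code. */
static void step(void) {
    switch (pc++) {
    case 0: call_from_c(); break;       /* libm5confirmC() calls back into SML */
    case 1: break;                      /* remaining top-level code */
    case 2: returnToC = true; break;    /* intended suspend point */
    case 3: at_exit_ran = true; break;  /* 'clean atExit' */
    default: returnToC = true; break;   /* final return to C */
    }
}

/* Trampoline loop shared by both entry points: clear the flag, then run
 * chunks until some chunk asks to return to C. */
static void trampoline(void) {
    returnToC = false;
    do {
        step();
    } while (!returnToC);
}

static void m5_open(void)  { pc = 0; at_exit_ran = false; trampoline(); }
static void m5_close(void) { trampoline(); }
```

In this model, calling m5_open() stops after step 0 because the flag left
behind by call_from_c() is still set. A following m5_close() resumes at
step 1 and stops at the suspend point in step 2, so the atExit step never
runs.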
The reason that this erroneous behavior doesn't occur with all of the
C-codegen library tests is that it depends upon how the program is
chunkify-ed. Chunkifying is simply the coarse grouping of RSSA IL
functions into larger 'chunks' to create larger/fewer .c files. However,
it is a size-based grouping, so small changes to the RSSA IL (such as the
extra code inserted for time profiling) can change the grouping and
introduce an inter-chunk transfer that manifests the bug. One can cause
the bug to manifest in the other C-codegen tests by compiling with
'-chunkify func', which uses the finest chunking (increasing the number
of inter-chunk transfers).
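
The dependence on chunking can be sketched with another toy model in C.
Everything in it is an assumption made for illustration: the top-level
code is taken to be NFUNCS functions grouped into fixed-size chunks, the
trampoline is taken to test 'returnToC' only at chunk boundaries, and
function 0 stands for the nested call that leaves the flag set.

```c
#include <stdbool.h>

enum { NFUNCS = 6 };   /* hypothetical number of top-level functions */

static bool returnToC;

/* Run one function of the top-level code. */
static void run_function(int f) {
    if (f == 0)
        returnToC = true;   /* stale flag left by the nested trampoline */
    if (f == NFUNCS - 1)
        returnToC = true;   /* legitimate request to return to C */
}

/* Returns how many functions execute before the trampoline stops.
 * chunk_size must be at least 1. */
static int functions_run(int chunk_size) {
    int f = 0;
    returnToC = false;
    do {
        /* Index one past the last function of the current chunk. */
        int end = (f / chunk_size + 1) * chunk_size;
        if (end > NFUNCS)
            end = NFUNCS;
        /* Intra-chunk transfers: the flag is not checked here. */
        while (f < end)
            run_function(f++);
    } while (!returnToC);   /* checked only at an inter-chunk transfer */
    return f;
}
```

With everything in one chunk, functions_run(NFUNCS) runs all six
functions and the stale flag is harmless. With one function per chunk
(the analogue of '-chunkify func' in this model), functions_run(1) stops
after the first function.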
The fix is trivial; set 'returnToC' to 'FALSE' after completing a
'MLton_callFromC' trampoline:
Modified: mlton/trunk/include/c-main.h
===================================================================
--- mlton/trunk/include/c-main.h 2008-10-09 21:35:20 UTC (rev 6918)
+++ mlton/trunk/include/c-main.h 2008-10-10 02:22:13 UTC (rev 6919)
@@ -39,6 +39,7 @@
do { \
cont=(*(struct cont(*)(void))cont.nextChunk)(); \
} while (not returnToC); \
+ returnToC = FALSE; \
s->atomicState += 1; \
GC_switchToThread (s, GC_getSavedThread (s), 0); \
s->atomicState -= 1; \
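
The effect of the added line can be checked in a small stand-alone model
in C. This is a sketch under assumptions, not the code in c-main.h: the
'apply_fix' parameter, the step counts, and both function names are
invented here to toggle the reset and observe the outer trampoline.

```c
#include <stdbool.h>

static bool returnToC;   /* global flag shared by all trampolines */

/* Stand-in for the MLton_callFromC loop; 'apply_fix' toggles the reset. */
static void call_from_c(bool apply_fix) {
    returnToC = false;
    do {
        returnToC = true;      /* exported SML function returns */
    } while (!returnToC);
    if (apply_fix)
        returnToC = false;     /* models the line added by the patch */
}

/* Outer trampoline over 'nsteps' inter-chunk transfers (nsteps >= 1).
 * Step 0 makes the nested call; the last step asks to return to C.
 * Returns the number of steps that actually ran. */
static int outer_steps_run(int nsteps, bool apply_fix) {
    int step = 0;
    returnToC = false;
    do {
        if (step == 0)
            call_from_c(apply_fix);
        if (step == nsteps - 1)
            returnToC = true;
        step++;
    } while (!returnToC);
    return step;
}
```

Without the reset, outer_steps_run(4, false) stops after one step; with
it, outer_steps_run(4, true) runs all four steps before returning to C.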