[MLton] C codegen and world* regressions
Matthew Fluet
fluet@cs.cornell.edu
Sat, 17 Jun 2006 13:57:57 -0400 (EDT)
> The problem is due to changing the primtive from
> val save = _prim "World_save": C.Fd.t -> unit
> to
> val save = _prim "World_save": NullString8_t -> bool C_Error.t
>
> What happens is that the late stages of the compiler turn this primitive into
> a C call. When the program is saving a world, the ML code calls
> GC_saveWorld, and, in the absence of errors, control returns to the ML code
> _with a true return value_. When the program loads the saved world, the
> runtime initialization prepares the ML heap and then control returns to the
> ML code _without a return value_. Luck of the draw, it appears that the main
> function in x86-main.h happens to leave a non-zero value in %eax.
It occurs to me that this isn't quite a satisfactory explaination as to
why things worked with the native codegen. In particular, Wesley's
original change returned 'false' when the save succeeded and 'true' when
the save failed, and my subsequent change reversed the meaning to return
'true' when the save succeeded and 'false' when the save failed. Neither
change modified x86-main.h, so I would expect the jumpToML code to be
compiled the same, meaning that the (garbage) value in %eax should be the
same in both versions. But, that value can't be both 'true' and 'false'.
Or can it? ;-)
In Wesley's version, he had
val () =
SysCall.simple'
({errVal = true},
fn () => Prim.save (NullString.nullTerm file))
which induces a comparison between the return value and 'true'. The
representation pass represents 'true' as 1, so the native-codegen emits:
cmpl $0x1,%eax
je L_true_branch
L_false_branch:
In my version, I had
val () =
SysCall.simple'
({errVal = false},
fn () => Prim.save (NullString.nullTerm file))
which induces a comparison between the return value and 'false'. The
representation pass represents 'false' as 0, so the native-codegen emits:
testl %eax,%eax
jz L_true_branch
L_false_branch:
So, an %eax value not-equal to either 0 or 1 always follows the
L_false_branch (that is, the non-error branch). Now, it seems pretty
likely that garbage in %eax is some value that is neither 0 nor 1.
I note that this is a general 'issue' with MLton's FFI; a program like
this:
val f = _import "f": unit -> bool;
val () = if f () = true
then ...
else ...
yields Machine IL code like:
CCall {args = (),
frameInfo = Some {frameLayoutsIndex = 53},
func = {args = (),
bytesNeeded = None,
convention = cdecl,
ensuresBytesFree = false,
mayGC = true,
maySwitchThreads = false,
modifiesFrontier = true,
prototype = {args = (), res = Some Int32},
readsStackTop = true,
return = Word32,
target = f,
writesStackTop = true},
return = Some L_877}
L_877: {kind = CReturn {dst = Some RW32(0): Word32,
frameInfo = Some {frameLayoutsIndex = 53},
func = {args = (),
bytesNeeded = None,
convention = cdecl,
ensuresBytesFree = false,
mayGC = true,
maySwitchThreads = false,
modifiesFrontier = true,
prototype = {args = (), res = Some Int32},
readsStackTop = true,
return = Word32,
target = f,
writesStackTop = true}},
live = (SP(16): Pointers (pt_7),
SP(44): Pointers (pt_23),
SP(40): Pointers (pt_22)),
raises = None,
returns = Some ()}
RW32(1): Word32 = Word32_equal (RW32(0): Word32, 0x1)
switch {test = RW32(1): Word32,
default = None,
cases = ((0x0, L_876), (0x1, L_864))}
So, every codegen will compare the result value to 0x1. Similarly, a
program like:
val f = _import "f": unit -> bool;
val () = if f () = true
then ...
else ...
will yield a Machine IL proram that compares the result value to 0x0.
Meanwhile, a program like:
val f = _import "f": unit -> bool;
val () = if f ()
then ...
else ...
yields a Machile IL program with:
CCall {args = (),
frameInfo = Some {frameLayoutsIndex = 53},
func = {args = (),
bytesNeeded = None,
convention = cdecl,
ensuresBytesFree = false,
mayGC = true,
maySwitchThreads = false,
modifiesFrontier = true,
prototype = {args = (), res = Some Int32},
readsStackTop = true,
return = Word32,
target = f,
writesStackTop = true},
return = Some L_877}
L_877: {kind = CReturn {dst = Some RW32(0): Word32,
frameInfo = Some {frameLayoutsIndex = 53},
func = {args = (),
bytesNeeded = None,
convention = cdecl,
ensuresBytesFree = false,
mayGC = true,
maySwitchThreads = false,
modifiesFrontier = true,
prototype = {args = (), res = Some Int32},
readsStackTop = true,
return = Word32,
target = f,
writesStackTop = true}},
live = (SP(16): Pointers (pt_7),
SP(44): Pointers (pt_23),
SP(40): Pointers (pt_22)),
raises = None,
returns = Some ()}
switch {test = RW32(0): Word32,
default = None,
cases = ((0x0, L_876), (0x1, L_864))}
The semantics of a Machine IL 'switch' transfer is that the 'cases' (+
'default') exhaustively cover the 'test' value. So, the codegens are free
to optimize this to a comparison with either 0x0 or 0x1, assuming that an
unequal comparison implies that the value is the other value. I point this
out because if the C-function f returns a value which is neither zero nor
one, then the program compiled with two different codegens might exhibit
different behavior.
I think this is o.k.; all it means is that an ML 'bool' is not the same as
a C conditional expression; rather it is closer to the C99 bool_t type.
Using a value not equal to either zero or one for an ML 'bool' leads to
undefined behavior.
> 2) return the status of the save out-of-band; this is essentially the
> solution used by the thread primitives; we could add a
> val getSaveStatus =
> _import "GC_saveWorldSatus" : GCState.t -> bool C_Errno.t
> and check the status after the save. I think this is the way to go.
I've checked in a fix along these lines.