[MLton] C codegen and world* regressions
Matthew Fluet
fluet@cs.cornell.edu
Sat, 17 Jun 2006 16:49:38 -0400 (EDT)
>> val f = _import "f": unit -> bool;
> ...
>> I point this out because if the C-function f returns a value which
>> is neither zero nor one, then the program compiled with two
>> different codegens might exhibit different behavior.
>>
>> I think this is o.k.; all it means is that an ML 'bool' is not the
>> same as a C conditional expression; rather it is closer to the C99
>> bool (_Bool) type. Using a value not equal to either zero or one for an ML
>> 'bool' leads to undefined behavior.
>
> That seems OK to me too. The old runtime/basis was sloppy about this,
> confusing bools and ints. Hopefully the new runtime will completely
> clarify things. Also once this is all done, the representation pass
> can be tweaked so it packs booleans as single bits -- right now it
> keeps them as 32 bits, at least partially because of confusion between
> C booleans and ints.

I guess I don't quite follow. Right now, although we are trying to be a
little more careful about not confusing ints and bools, I don't think we're
getting any more guarantees. The Bool_t typedef in ml-types.h (which is
what the C side uses for an ML bool) simply has

  typedef int32_t Int32_t;
  typedef Int32_t Bool_t;

So, even if we promise to return an ML bool by using Bool_t, C doesn't
help us much, since it will happily allow

  Bool_t foo() {
    return 42;
  }

and foo will pass 42 back as the return value.
So, I don't know how things are clarified. We're trying to document our
intentions better, but I don't think we have any more guarantees.
Maybe you are suggesting that we have

  typedef bool Bool_t;

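To make the difference concrete, here is a small sketch contrasting the two typedefs. It is not taken from ml-types.h: the names IntBool_t, C99Bool_t, as_int_bool, and as_c99_bool are made up for illustration. The only fact it relies on is that C99 converts any scalar value to bool by comparing it against zero, so the result is always 0 or 1.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names, for illustration only. */
typedef int32_t IntBool_t;  /* mirrors the current int32_t-based Bool_t */
typedef bool    C99Bool_t;  /* the C99 bool-based alternative */

/* The typedef is just an int32_t, so any value passes through unchanged. */
IntBool_t as_int_bool(int x) { return x; }

/* Conversion to C99 bool maps every non-zero value to 1 and zero to 0. */
C99Bool_t as_c99_bool(int x) { return x; }
```

With these definitions, as_int_bool(42) hands 42 back to the caller, while as_c99_bool(42) hands back 1. That only covers return values, though; it says nothing about how arrays of bool are laid out.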
> I also see that
>
> http://mlton.org/ForeignFunctionInterfaceTypes
>
> says that the SML bool type is equivalent to the C Int32 type. We'll
> need to change this to C "bool" type and to warn people about the new,
> more precise type (and the fact that the value must be 0 or 1).

I don't know if that quite works, because I'm not sure that the natural C
representation of bool[] is necessarily the same as MLton's representation
of bool array. In particular, we could turn it into a bit array, while I
think C will make it an 8-bit char array. So you still have to pin things
down to what the C side does.
Would removing bool from the FFI types be a horrible idea?
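The representation mismatch for arrays can be sketched as follows. This is an illustration only, not MLton's actual layout: the packed format (least significant bit first within each byte) and all function names are assumptions. Note also that sizeof(bool) is implementation-defined in C, although it is commonly 1.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes used by n flags stored as a plain C bool[]. */
size_t c_bool_array_bytes(size_t n) { return n * sizeof(bool); }

/* Bytes used by n flags packed one per bit (hypothetical layout). */
size_t packed_bytes(size_t n) { return (n + 7) / 8; }

/* Read flag i from a bit-packed array, LSB-first within each byte. */
bool packed_get(const uint8_t *bits, size_t i) {
    return (bits[i / 8] >> (i % 8)) & 1u;
}

/* Write flag i in a bit-packed array, LSB-first within each byte. */
void packed_set(uint8_t *bits, size_t i, bool v) {
    uint8_t mask = (uint8_t)(1u << (i % 8));
    if (v)
        bits[i / 8] |= mask;
    else
        bits[i / 8] &= (uint8_t)~mask;
}
```

A C function handed a pointer to the packed form cannot index it as bool[], which is why the two sides would have to agree on one layout before bool array could be passed across the FFI.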
> I looked at all the uses of Bool_t in basis-ffi.h to see if the
> functions exported by the runtime were confusing bools and ints. I
> found a few that I'm not sure about.

Yeah, I've added auditing the uses of Bool_t to my todo list.

> 1. The functions
>
> PosixFileSys_ST_is{Blk,Chr,Dir,FIFO,Link,Reg,Sock}
>
> all return Bool_t by calling the corresponding macro
>
> S_IS{BLK,CHR,DIR,FIFO,LNK,REG,SOCK}
>
> On my Linux machine these all expand to relational expressions (==),
> but I don't know if the standard (or standard practice) guarantees
> these to be booleans.

  http://www.opengroup.org/onlinepubs/009695399/basedefs/sys/stat.h.html

    The following macros shall be provided to test whether a file is of
    the specified type. The value m supplied to the macros is the value
    of st_mode from a stat structure. The macro shall evaluate to a
    non-zero value if the test is true; 0 if the test is false.

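Since the specification only promises "non-zero", a wrapper that wants to hand a strict 0/1 value back to ML has to normalize the macro result itself. A minimal sketch, assuming the int32_t-based Bool_t discussed earlier in this message; the names to_ml_bool and path_is_dir are hypothetical and are not the runtime's actual functions:

```c
#include <stdint.h>
#include <sys/types.h>
#include <sys/stat.h>

typedef int32_t Bool_t;  /* assumed to match the typedef in ml-types.h */

/* Collapse any C truth value to exactly 0 or 1. */
Bool_t to_ml_bool(int c) { return c != 0; }

/* Hypothetical helper: stores a normalized 0/1 flag in *is_dir.
 * Returns 0 on success, -1 if stat() fails. */
int path_is_dir(const char *path, Bool_t *is_dir) {
    struct stat st;
    if (stat(path, &st) != 0)
        return -1;
    *is_dir = to_ml_bool(S_ISDIR(st.st_mode));
    return 0;
}
```

The comparison with zero costs nothing when the macro already expands to a relational expression, and it makes the result well-defined on platforms where it does not.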
> 2. The function Posix_ProcEnv_isatty returns a Bool_t by calling
> isatty, which on my machine is specified to return an int.
That's probably right.
> 3. The functions
>
> Posix_Process_if{Exited,Signaled,Stopped}
>
> all return Bool_t by calling the corresponding macro
>
> WIF{EXIT,SIGNAL,STOPP}ED
>
> Again, on my machine these all expand to relational expressions,
> but I don't know if the standard guarantees these to be booleans.

  http://www.opengroup.org/onlinepubs/009695399/functions/wait.html

    WIFEXITED(stat_val)
      Evaluates to a non-zero value if status was returned for a child
      process that terminated normally.
    WIFSIGNALED(stat_val)
      Evaluates to a non-zero value if status was returned for a child
      process that terminated due to the receipt of a signal that was
      not caught (see <signal.h>).
    WIFSTOPPED(stat_val)
      Evaluates to a non-zero value if status was returned for a child
      process that is currently stopped.

> I wonder if the right thing to do for situations where we are unsure
> or have C functions that return int instead of bool, is to expose them
> as int in the FFI and convert the int type to bool in SML code with a
> comparison to zero.

If we disallowed bool in indirect FFI types (i.e., bool ref, bool array,
bool vector), I could imagine implementing this as a type-directed
elaboration of FFI functions.