[MLton] Type cleanup for 64bit port (was: sparc works)

Wesley W. Terpstra terpstra@gkec.tu-darmstadt.de
Thu, 23 Dec 2004 05:27:10 +0100


On Wed, Dec 22, 2004 at 06:37:33PM -0800, Stephen Weeks wrote:
> Done, except I used -1 for FE_NOSUPPORT.

That's fine. My weird number was to avoid conflict with the other FE_*.
A closer reading indicates that each one of those must be a single bit.
Therefore, -1 = all bits should indeed be a safe choice.

> > I suppose you could define Pid, Uid, Time, Size, etc to be the smallest 
> > SML compiler supported type larger than pid_t, uid_t, time_t, size_t, etc.
> 
> Sounds good to me.

I'm still uncomfortable with this.

If we take this approach we should be very disciplined in the runtime to
immediately cast the parameter to the correct C type. Otherwise, macros
and variable-argument functions may run amok. You can never tell what
function might be a macro when it comes to C...

I assume below that: typedef int Int; ie: Int is not SML's int.
(A good rule might be that SML int must be >= C int in size.
 Perhaps also <= C long.)

eg:

Pid Posix_Process_waitpid (Pid p_, Pointer s_, Int i) {
	pid_t p = (pid_t)p_;
	int*  s = (int*) s_;
	
	return (Pid)waitpid(p, s, i);
}

To see why we need to do this, imagine that waitpid is implemented as a
macro and pid_t is a 40-bit number. waitpid might do this:

#define __kernel_pid(p)	(~(p) >> 1)
#define waitpid(p, s, i) __sys_waitpid(__kernel_pid(p), s, i)

Suppose p = 0x0000 00FF FFFF FFFF (all 40 bits set -- stored in 64bits).
Then the correct kernel_pid is 0x00 0000 0000.

However, if we let C have its way, what we actually pass is 
0x80 0000 0000. Oops!
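
The same effect can be reproduced with ordinary types. This is only a
sketch under assumptions: a 32-bit narrow_pid stands in for the 40-bit
pid_t (C has no portable 40-bit type), Pid is taken to be 64 bits, and
KERNEL_PID and sys_echo are made-up names that mimic the macro above. It
also assumes unsigned int is 32 bits wide.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 32-bit stand-in for the 40-bit pid_t in the text. */
typedef uint32_t narrow_pid;
/* Assumption: the wider SML-side type, 64 bits here. */
typedef uint64_t Pid;

/* Same shape as __kernel_pid above; the arithmetic is done in the type
 * of whatever expression is passed in. */
#define KERNEL_PID(p) (~(p) >> 1)

/* Hypothetical "system call": just hands back the value it received. */
static narrow_pid sys_echo(narrow_pid k) { return k; }

/* The macro sees the 64-bit value: ~ and >> happen in 64 bits and only
 * the final result is truncated to narrow_pid at the call. */
narrow_pid kernel_pid_uncast(Pid p_) {
    return sys_echo(KERNEL_PID(p_));
}

/* Cast first: the macro now computes in the narrow type. */
narrow_pid kernel_pid_cast(Pid p_) {
    narrow_pid p = (narrow_pid)p_;
    return sys_echo(KERNEL_PID(p));
}

/* Self-check for the "all bits set" input from the text. */
void check_kernel_pid(void) {
    assert(kernel_pid_uncast(0xFFFFFFFFu) == 0x80000000u); /* stray top bit */
    assert(kernel_pid_cast(0xFFFFFFFFu) == 0u);            /* intended value */
}
```

With all 32 low bits set, the uncast version leaks one of the inverted
upper bits into the top bit of the result, which is the 32-bit analogue
of the 0x80 0000 0000 above.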

This is not very far fetched; I read the cygwin code while looking into
MLton.Process. It does something very similar, but thankfully inside a
function call, and the type is 32bit. However, there's no reason they might
not at some point decide to 'optimize' it with an inlining macro.

The problem with variable-argument functions (like printf) is that they
won't convert the type for us; the argument will just keep its current
(too big) size.
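
A rough sketch of the discipline this implies, with made-up names: here
narrow_pid is plain int and Pid is long long (both assumptions), and
sum_pids is a hypothetical variadic callee that reads each argument as a
narrow_pid. The caller has to narrow every value itself, because nothing
in a variadic call will do it.

```c
#include <assert.h>
#include <stdarg.h>

/* Assumptions: stand-ins for pid_t and for the wider SML-side type. */
typedef int narrow_pid;
typedef long long Pid;

/* Hypothetical variadic callee: expects `count` narrow_pid arguments. */
static long long sum_pids(int count, ...) {
    va_list ap;
    long long total = 0;
    va_start(ap, count);
    for (int i = 0; i < count; i++)
        total += va_arg(ap, narrow_pid); /* reads exactly a narrow_pid */
    va_end(ap);
    return total;
}

/* Cast to the narrow type first. Passing a_ and b_ directly would push
 * long long values that va_arg(ap, narrow_pid) does not expect, which
 * is undefined behaviour. */
long long sum_two(Pid a_, Pid b_) {
    narrow_pid a = (narrow_pid)a_;
    narrow_pid b = (narrow_pid)b_;
    return sum_pids(2, a, b);
}

/* Self-check with values that fit in the narrow type. */
void check_sum_two(void) {
    assert(sum_two(3, 4) == 7);
}
```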

In my example, the return cast is not necessary.
However, if we store the result and proceed to calculate with it further, we
must make certain we use a pid_t at every step except the last where it is
cast back to the bigger SML type.

-- 
Wesley W. Terpstra