[MLton] native x86-64 code-gen

Daniel C. Wang danwang@CS.Princeton.EDU
Mon, 04 Jul 2005 15:16:36 -0700

So, I did some digging on the web to understand the x86-64 32-bit 
backward compatibility story.
It seems you have the choice of running as a 64-bit app with all the new 
extra registers and other goodies, or just running as if you were a 
plain old 32-bit x86.

Given how close the 64-bit ISA is to the 32-bit ISA, I suspect porting 
the current native code gen to x86-64 would not be too much work: a few 
tweaks to the register allocator and you'd get some extra registers, 
access to native 64-bit arithmetic, and any other nice goodies. All 
the work, of course, is going to be in porting the runtime and other parts 
of the system that assume 32-bit pointers.

Like I said, the Alpha had a legacy 32-bit mode that was designed for 
32-bit apps ported to the Alpha. The OS just made sure all memory the 
user app could touch was in the lower 32-bit address space. It seems 
this trick isn't supported on Linux x86-64. I guess they assume you 
just want binary compatibility, or else you ought to make your program 
64-bit clean. So, I'm just wondering how much work it would be to 
implement a similar approach for MLton.

Tweak the current native code-gen to use the extra regs and generate a 
64-bit program that assumes all pointers are 32-bit in size. This might 
require some extra masking when dealing with addresses.
Modify the runtime to make sure the MLton heap and stack live in the 
lower 32-bit address space.

So this almost works, except for dealing with the FFI and for situations 
where the runtime returns a 64-bit pointer that needs to be stored in an 
MLton data structure.
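
To make the idea concrete, here is a rough C sketch of what "32-bit 
pointers inside a 64-bit program" could look like at the runtime boundary. 
None of these names come from MLton; ml_ptr32, ml_narrow and ml_widen are 
hypothetical helpers made up for illustration. The check in ml_narrow is 
exactly the FFI problem above: a pointer with any of its upper 32 bits set 
cannot be stored in a 32-bit slot.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 32-bit representation of a pointer stored in an ML object. */
typedef uint32_t ml_ptr32;

/* Returns 1 if the address lies below 4 GB, i.e. it survives truncation. */
static inline int fits_in_32(const void *p) {
    return (uintptr_t)p <= (uintptr_t)UINT32_MAX;
}

/* Narrow a native pointer for storage in a heap object.
 * Returns 0 (and leaves *out untouched) if the upper bits are in use,
 * which is what can happen with a pointer handed back through the FFI. */
static inline int ml_narrow(const void *p, ml_ptr32 *out) {
    if (!fits_in_32(p)) {
        return 0;
    }
    *out = (ml_ptr32)(uintptr_t)p;
    return 1;
}

/* Widen a stored pointer back to a native one by zero-extension. */
static inline void *ml_widen(ml_ptr32 p) {
    return (void *)(uintptr_t)p;
}
```

If the runtime really does keep the heap and stack below 4 GB, ml_narrow 
never fails for ML objects, and only pointers coming from outside (C 
libraries, the OS) need the check.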

So the advantage of what I outlined above, as well as being simpler than 
a full 64-bit port, is that pointers stay 32-bit. Since lots of ML 
structures are pointer intensive, a full 64-bit approach is likely to 
almost double the size of many small objects and eat up space in the 
cache for no good reason.
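
A back-of-the-envelope version of the size argument, as a C sketch. The 
layout is an assumption, not MLton's actual object representation: a cons 
cell is modeled as one header word plus two fields, and the 64-byte cache 
line is likewise just an assumed typical value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed cache line size in bytes; real hardware may differ. */
#define CACHE_LINE_BYTES 64

/* Hypothetical cons cell with 32-bit header and 32-bit pointer fields. */
struct cell32 {
    uint32_t header;
    uint32_t head;
    uint32_t tail;
};

/* The same hypothetical cell with every word widened to 64 bits. */
struct cell64 {
    uint64_t header;
    uint64_t head;
    uint64_t tail;
};

/* Number of whole cells of the given size that fit in one cache line. */
static inline size_t cells_per_line(size_t line_bytes, size_t cell_bytes) {
    return line_bytes / cell_bytes;
}
```

On common ABIs sizeof(struct cell32) is 12 bytes and sizeof(struct cell64) 
is 24, so under these assumptions cells_per_line(CACHE_LINE_BYTES, ...) 
drops from 5 to 2 when going fully 64-bit.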

So does this sound at all plausible, or is there some very nasty gotcha 
that I've overlooked?