[MLton] native x86-64 code-gen
Daniel C. Wang
danwang@CS.Princeton.EDU
Mon, 04 Jul 2005 15:16:36 -0700
So, I did some digging on the web to understand the x86-64 32-bit
backward compatibility story.
It seems you have the choice of running as a 64-bit app with all the new
extra registers and other goodies, or just running as if you were a
plain old 32-bit x86.
Given how close the 64-bit ISA is to the 32-bit ISA, I suspect porting
the current native code gen to x86-64 would not be too much work: a few
tweaks to the register allocator and you'd get some extra registers,
access to native 64-bit arithmetic, and any other nice goodies. The real
work, of course, is going to be in porting the runtime and other parts
of the system that assume 32-bit pointers.
Like I said, the Alpha had a legacy 32-bit mode that was designed for
32-bit apps ported to the Alpha. The OS just made sure all memory the
user app could touch was in the lower 32 bits of the address space. It
seems this trick isn't supported on Linux x86-64. I guess they assume you
just want binary compatibility or else you ought to make your program
64-bit clean. So, I'm just wondering how much work it would be to
implement a similar approach for MLton:
1. Tweak the current native code-gen to use the extra regs and generate a
   64-bit program that assumes all pointers are 32-bit in size. This might
   require some extra masking when dealing with addresses.
2. Modify the runtime to make sure the MLton heap and stack live in the
   lower 32-bit address space.
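
To make the two steps above concrete, here is a rough sketch in C of what
the runtime side might look like. It is not taken from the MLton runtime;
the names ptr32, LOW_LIMIT, in_low_space, alloc_low_heap, to_ptr32 and
from_ptr32 are all made up for illustration. It assumes a POSIX mmap. On
Linux x86-64, mmap does document a per-mapping MAP_32BIT flag that places
the mapping in the first 2 GB; elsewhere the sketch falls back to an
address hint (the hint value is arbitrary) and simply gives up if the
kernel places the mapping too high.

```c
#define _GNU_SOURCE              /* exposes MAP_32BIT / MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

#ifndef MAP_ANONYMOUS
#define MAP_ANONYMOUS MAP_ANON
#endif

/* Hypothetical name: a heap pointer stored in 32 bits. */
typedef uint32_t ptr32;

/* Exclusive upper bound on addresses that a ptr32 can represent (4 GB). */
#define LOW_LIMIT UINT64_C(0x100000000)

/* True if the whole range [p, p + len) lies below 4 GB. */
static int in_low_space(const void *p, size_t len)
{
    uint64_t start = (uint64_t)(uintptr_t)p;
    return start <= LOW_LIMIT && (uint64_t)len <= LOW_LIMIT - start;
}

/* Try to map len bytes of zeroed memory below 4 GB; NULL on failure. */
static void *alloc_low_heap(size_t len)
{
    int flags = MAP_PRIVATE | MAP_ANONYMOUS;
    void *hint = NULL;
    void *p;

    assert(len > 0);                        /* mmap rejects empty mappings */
#ifdef MAP_32BIT
    flags |= MAP_32BIT;                     /* Linux x86-64: first 2 GB */
#else
    hint = (void *)(uintptr_t)0x10000000;   /* arbitrary, only a hint */
#endif
    p = mmap(hint, len, PROT_READ | PROT_WRITE, flags, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
    if (!in_low_space(p, len)) {            /* mapping landed too high */
        munmap(p, len);
        return NULL;
    }
    return p;
}

/* Narrow a pointer for storage in a heap object. Returns 0 if it does
   not fit, e.g. a 64-bit pointer handed back through the FFI. */
static int to_ptr32(const void *p, ptr32 *out)
{
    if ((uint64_t)(uintptr_t)p >= LOW_LIMIT)
        return 0;
    *out = (ptr32)(uintptr_t)p;
    return 1;
}

/* Widen a stored pointer again by zero extension. */
static void *from_ptr32(ptr32 v)
{
    return (void *)(uintptr_t)v;
}
```

If the heap really does live below 4 GB, widening is just a zero extension
and no masking is needed on loads. The failure case in to_ptr32 is where
pointers from outside the heap show up.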
So this almost works, except for the FFI and any situation where the
runtime returns a 64-bit pointer that needs to be stored in an
MLton data structure.
So the advantage of what I outlined above, besides being simpler than
a full 64-bit port, is that pointers stay 32-bit. Since lots of ML
structures are pointer intensive, a full 64-bit approach is likely to
almost double the size of many small objects and eat up space in the
cache for no good reason.
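
As a quick illustration of the size argument, here is a small C sketch
comparing a two-field cons cell under both layouts. The struct and
function names are made up, and it ignores whatever object header the
runtime adds, so it only shows the pointer payload. On a typical LP64
x86-64 build the native cell comes to 16 bytes against 8 for the 32-bit
one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical two-field cons cell using native pointers. */
struct cons_native {
    void *head;
    struct cons_native *tail;
};

/* The same cell with both fields stored as 32-bit pointers. */
struct cons_32 {
    uint32_t head;
    uint32_t tail;
};

/* Payload size of each layout, ignoring any object header. */
static size_t cons_native_size(void) { return sizeof(struct cons_native); }
static size_t cons_32_size(void)     { return sizeof(struct cons_32); }

/* Extra bytes per cell that native pointers cost over 32-bit ones. */
static size_t cons_overhead(void)
{
    assert(cons_native_size() >= cons_32_size());
    return cons_native_size() - cons_32_size();
}
```

Multiply cons_overhead() by the number of live cells in a long list and
you get the extra heap, and extra cache traffic, that full 64-bit pointers
would cost.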
So does this sound at all plausible, or is there some very nasty gotcha
that I've overlooked?