[MLton-user] minor 32bit vs. 64bit differences in floating-point
calculations with large numbers
Wesley W. Terpstra
wesley at terpstra.ca
Fri Oct 15 09:41:01 PDT 2010
On Fri, Oct 15, 2010 at 3:12 PM, David Hansel
<hansel at reactive-systems.com> wrote:
> Thanks for that information. Just one more question
> (more out of curiosity than anything else): is there
> a technical reason that 64-bit MLton does NOT use the
> FPU?
Well, I think Matthew considered the SSE2 instructions superior to the x87
instructions. SSE2 is part of the baseline on every amd64 machine, so it is
always available there.
You'd have to ask Matthew for the details, but my understanding is that flat
registers are easier to work with than a register stack: you can use
traditional register allocation algorithms. Since the x87 FPU has 8 slots in
its stack and SSE2 on 64-bit has 16 registers, you also get double the
"registers" plus "random access". Plus, every step of a calculation is done
in 32-bit or 64-bit math, so you don't need to worry about the x87's extra
(80-bit) intermediate precision.
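To make the "extra precision" point concrete, here is a small sketch in
Python (not MLton code). It does not execute any x87 or SSE2 instructions; it
only emulates the effect by comparing a sum that is rounded to a 64-bit
double after every addition (what SSE2 double arithmetic does) with a sum
kept exact and rounded once at the end (a stand-in for wider intermediates;
in this particular example the intermediates would also fit in the x87's
64-bit mantissa). The function names are made up for illustration.

```python
from fractions import Fraction


def sum_rounding_each_step(values):
    """Add left to right, rounding to a 64-bit double after every addition."""
    total = 0.0
    for v in values:
        # Python floats are IEEE 754 doubles, so each '+' rounds to 64 bits.
        total = total + v
    return total


def sum_rounding_once(values):
    """Add exactly with rationals, then round to a double once at the end.

    This is an assumption-laden stand-in for keeping intermediates in a
    wider format; it is not a model of the x87 itself.
    """
    total = Fraction(0)
    for v in values:
        total += Fraction(v)  # exact conversion and exact addition
    return float(total)  # single, correctly rounded conversion


if __name__ == "__main__":
    values = [1e16, 1.0, 1.0]
    print(sum_rounding_each_step(values))
    print(sum_rounding_once(values))
```

With these inputs the per-step version loses both 1.0 terms (1e16 + 1 is a
tie that rounds back to 1e16), while the round-once version returns
1e16 + 2. Differences of this kind are what you can see between a build that
keeps wider intermediates and one that rounds at every step.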
Personally, I'd love to see 128-bit floating point using SSE2 registers;
then the x87 would be completely obsolete.
> Or is this just a case of "nobody has implemented
> that yet"?
Well, the amd64 and x86 codegens are very similar. You could probably mostly
cut-and-paste the x86 FPU instructions out of the x86 codegen with very
little trouble.
A perhaps better question might be: how hard would it be to port the SSE2
math to i386? ;) Most modern 32-bit processors also have SSE2. Then you
could have a (hypothetical) -ieee-fp sse2 option on i386 to get the same
effect as -ieee-fp true while running even faster than -ieee-fp false (at
the cost of requiring SSE2 support, of course).