[MLton-user] minor 32bit vs. 64bit differences in floating-point calculations with large numbers

Fri Oct 15 09:41:01 PDT 2010

On Fri, Oct 15, 2010 at 3:12 PM, David Hansel
<hansel at reactive-systems.com> wrote:

> Thanks for that information.  Just one more question
> (more out of curiosity than anything else):  is there
> a technical reason that 64-bit MLton does NOT use the
> FPU?

Well, I think Matthew considered the SSE2 instructions superior to the x87
instructions. SSE2 is part of the baseline on every x86-64 processor, so the
amd64 codegen can always rely on it.

You'd have to ask Matthew for the details, but my understanding is that flat
registers are easier to work with than a register stack: you can use
traditional register allocation algorithms. Since the x87 FPU has 8 slots in
its stack and SSE2 on 64-bit has 16 registers, you also get double the
"registers" plus "random access". On top of that, every step of a calculation
is performed directly in 32-bit or 64-bit math, so you don't need to worry
about the extra precision of the x87's 80-bit registers.
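
To make the "extra precision" point concrete, here is a small sketch (mine,
not anything taken from MLton) that emulates the rounding in exact rational
arithmetic using Python. It assumes round-to-nearest-even, ignores exponent
range, and takes 64 significand bits for the x87 extended format and 53 for a
double. The helper name round_sig is made up for illustration.

```python
from fractions import Fraction

def round_sig(x: Fraction, p: int) -> Fraction:
    """Round x to p significant bits, ties to even (exponent range ignored)."""
    if x == 0:
        return x
    sign = -1 if x < 0 else 1
    x = abs(x)
    # k = floor(log2(x)), computed exactly from the bit lengths.
    k = x.numerator.bit_length() - x.denominator.bit_length()
    if Fraction(2) ** k > x:
        k -= 1
    # Spacing between neighbouring p-bit values in the binade containing x.
    ulp = Fraction(2) ** (k - (p - 1))
    # round() on a Fraction rounds halves to the even integer.
    return sign * round(x / ulp) * ulp

# A value just above the midpoint between two adjacent doubles near 1.
x = 1 + Fraction(1, 2**53) + Fraction(1, 2**70)

# SSE2-style: one rounding, straight to a 53-bit significand.
direct = round_sig(x, 53)

# x87-style: round to a 64-bit significand first, then to 53 bits on store.
via_extended = round_sig(round_sig(x, 64), 53)

print(direct == via_extended)  # prints False
```

Under those assumptions the direct result is 1 + 2^-52, while the detour
through the wider format lands on exactly 1. The two paths disagree in the
last bit, which is the sort of small difference the subject line is about.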

Personally, I'd love to see 128-bit floating point using SSE2 registers;
then the x87 would be completely obsolete.

> Or is this just a case of "nobody has implemented
> that yet"?

Well, the amd64 and x86 codegens are very similar. You could probably mostly
cut-and-paste the x86 FPU instructions out of the x86 codegen with very
little trouble.

A perhaps better question might be: how hard would it be to port the SSE2
math to i386? ;) Most modern 32-bit x86 processors also have SSE2. Then you
could have something like -ieee-fp sse2 on i386 to get the same effect as
-ieee-fp true, but even faster than -ieee-fp false (at the cost of requiring
SSE2 support, of course).