<div class="gmail_quote">On Fri, Oct 15, 2010 at 3:12 PM, David Hansel <span dir="ltr">&lt;<a href="mailto:hansel@reactive-systems.com">hansel@reactive-systems.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

Thanks for that information.  Just one more question<br>

(more out of curiosity than anything else):  is there<br>

a technical reason that 64-bit MLton does NOT use the<br>

FPU?</blockquote><div><br>Well, I think Matthew considered the SSE2 instructions superior to the x87 instructions, and SSE2 is guaranteed to be present on every x86-64 processor.<br><br>You&#39;d have to ask Matthew for the details, but my understanding is that a flat register file is easier to work with than a register stack: you can use traditional register allocation algorithms. The x87 FPU has 8 slots in its stack and SSE2 in 64-bit mode has 16 XMM registers, so you also get double the &quot;registers&quot; plus &quot;random access&quot;. Finally, every step of a calculation is performed directly in 32-bit or 64-bit precision, so you don&#39;t need to worry about the x87&#39;s extra (80-bit) intermediate precision. <br>
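<br>To make the extra-precision point concrete, here is a rough sketch in Python (not MLton code). It simulates the two rounding schemes with exact rationals instead of running real x87 instructions, and it assumes the x87 is in its extended-precision mode (64-bit significand). The helper name round_sig is made up for the illustration, and it only handles values in [1, 2).<br>

```python
from fractions import Fraction

def round_sig(x, bits):
    """Round x, assumed to lie in [1, 2), to `bits` significant bits."""
    scale = 2 ** (bits - 1)
    # round() on a Fraction rounds half to even, like the IEEE default mode
    return Fraction(round(x * scale), scale)

a = Fraction(1)
b = Fraction(1, 2**53) + Fraction(1, 2**70)  # exactly representable as a double
exact = a + b                                # the infinitely precise sum

# SSE2 style: a single rounding, straight to a 53-bit double
sse2_style = round_sig(exact, 53)

# x87 style: round to the 64-bit extended significand, then again when
# the value is stored to memory as a 53-bit double
x87_style = round_sig(round_sig(exact, 64), 53)

print(sse2_style == 1 + Fraction(1, 2**52))  # True
print(x87_style == 1)                        # True
```

The two answers differ by one unit in the last place of a double: rounding once gives 1 + 2^-52, while rounding to extended precision first and then storing as a double gives exactly 1. With SSE2 only the first case can happen.<br>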

<br>Personally, I&#39;d love to see 128-bit floating point using SSE2 registers; then the x87 would be completely obsolete.<br><br></div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

Or is this just a case of &quot;nobody has implemented<br>

that yet&quot;? </blockquote><div><br></div></div>Well, the amd64 and x86 codegens are very similar. You could probably cut-and-paste most of the x87 FPU instruction handling from the x86 codegen into the amd64 one with very little trouble.<br><br>

Perhaps a better question is: how hard would it be to port the SSE2 math to i386? ;) Most modern 32-bit processors also have SSE2. Then you could have something like -ieee-fp sse2 on i386, which would give the same results as -ieee-fp true while running even faster than -ieee-fp false (at the cost of requiring SSE2 support, of course).<br>