[MLton-user] timing anomoly

Tue Dec 4 13:04:12 PST 2007

* Sean McLaughlin:

> However, is it possible to make this code run faster?  It currently runs
> over 3X slower than the equivalent C program:

GCC seems to be better at simplifying the polynomial, for some reason.

After substituting x for each of x1 .. x6,

  (-x1)*x4 + x2*x5 +(-x3)*x5 +(-x2)*x6 + x3*x6 +
      x4*(x2 + x3 + x5 + x6) + x4*(-x1)+x4*(-x4)

becomes

  (-x)*x + x*x +(-x)*x +(-x)*x + x*x +
     x*(x + x + x + x) + x*(-x)+x*(-x)

which is basically 

  x*x

(if I'm not mistaken).  I don't know why MLton doesn't perform the
transformation.  I haven't seen a specification of floating-point
semantics for SML, but the optimization would certainly produce
different results when intermediate values overflow.  This seems
contrary to the ML spirit, but I fear it's essential for decent
floating-point performance in some cases (in the sense that the compiler
may evaluate the expression in a way that brings it closer to the exact
real-number result).
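
To make the overflow point concrete, here is a small sketch in Python
(not SML or C; it is used only because its floats are IEEE doubles and
its multiplication returns inf on overflow instead of raising).  The
function names full and simplified are made up for illustration.  It
evaluates the polynomial as written and the simplified form x*x, once
with a small argument and once with one whose square overflows.

```python
import math

def full(x1, x2, x3, x4, x5, x6):
    # The polynomial exactly as written above, evaluated left to right.
    return ((-x1)*x4 + x2*x5 + (-x3)*x5 + (-x2)*x6 + x3*x6 +
            x4*(x2 + x3 + x5 + x6) + x4*(-x1) + x4*(-x4))

def simplified(x):
    # The algebraically simplified form after setting x1 .. x6 to x.
    return x * x

# Small argument: every intermediate value is exact, both forms agree.
small = 3.0
print(full(small, small, small, small, small, small), simplified(small))

# Large argument: x*x overflows to infinity.  In the full form the first
# two terms are -inf and +inf, and their sum is NaN, which then
# propagates through the rest of the expression.
big = 1e200
print(full(big, big, big, big, big, big), simplified(big))
```

For the large argument the two forms disagree (NaN versus infinity), so
a compiler that rewrites one into the other changes the observable
result.  Whether a given compiler is allowed to do that depends on its
floating-point rules and flags, which this sketch does not address.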