[MLton-user] timing anomoly
Florian Weimer
fw at deneb.enyo.de
Tue Dec 4 13:04:12 PST 2007
* Sean McLaughlin:
> However, is it possible to make this code run faster? It currently runs
> over 3X slower than the equivalent C program:
GCC seems to be better at simplifying the polynomial, for some reason.
After substituting x for each of x1 .. x6,
(-x1)*x4 + x2*x5 +(-x3)*x5 +(-x2)*x6 + x3*x6 +
x4*(x2 + x3 + x5 + x6) + x4*(-x1)+x4*(-x4)
becomes
(-x)*x + x*x +(-x)*x +(-x)*x + x*x +
x*(x + x + x + x) + x*(-x)+x*(-x)
which is basically
x*x
(if I'm not mistaken). I don't know why MLton doesn't perform this
transformation. I haven't seen a floating-point semantics for SML, but
the optimization certainly gives different results when intermediate
values overflow. This seems contrary to the ML spirit, but I fear
it's essential for decent floating-point performance in some cases (and
the simplified form may even bring the computed value closer to the
real result).
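
To make the overflow point concrete, here is a small sketch in Python
rather than SML or C (an assumption made for brevity; Python floats are
IEEE doubles). The names poly and simplified are made up for this
illustration and are not from the original benchmark. It evaluates the
expression as written and the simplified x*x for the same input:

```python
import math

def poly(x1, x2, x3, x4, x5, x6):
    # The expression as written in the message, evaluated left to right.
    return ((-x1)*x4 + x2*x5 + (-x3)*x5 + (-x2)*x6 + x3*x6 +
            x4*(x2 + x3 + x5 + x6) + x4*(-x1) + x4*(-x4))

def simplified(x):
    # What the expression reduces to when all six arguments equal x.
    return x * x

# For a small input the two forms agree.
small = 3.0
print(poly(small, small, small, small, small, small), simplified(small))

# For a large input, x*x is still representable, but intermediate sums
# in the long form overflow to -inf and +inf, and adding those gives NaN.
big = 1e154
print(poly(big, big, big, big, big, big), simplified(big))
```

With the large input the long form ends up as NaN while x*x stays
finite, so a compiler that rewrites one form into the other changes the
observable result.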