[MLton-user] more optimization questions
brian denheyer
briand@aracnet.com
Tue, 20 Dec 2005 22:19:49 -0800
> by a C call to a function that just calls fabs. The win in going from
> (2) to (3) is in eliminating the C wrapper around fabs. If anyone
> wants to repeat my experiment, I did (3) by adding a line to
> lib/mlton/include/c-chunk.h:
>
> #define Real64_abs fabs
>
Stephen,
Thanks very much for spending the time to look at this.
Just FYI for the list. The above change to the c-chunk.h file requires:
val abs = _import "fabs": real -> real;
to be added to the .sml file. I verified that the proper code was
generated using
-keep g.
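For anyone following along, here is a minimal C sketch of the difference between the two paths. It is not MLton's actual runtime code: `wrapped_abs`, `sum_abs_wrapped` and `sum_abs_direct` are hypothetical names made up for illustration, and only the `#define` line is taken from the quoted change.

```c
#include <math.h>

/* Hypothetical stand-in for the wrapper described in the quote:
   a C function whose only job is to call fabs. */
double wrapped_abs(double x)
{
    return fabs(x);
}

/* The alias from the quoted c-chunk.h change: the preprocessor rewrites
   every use of Real64_abs to fabs, so no wrapper call remains. */
#define Real64_abs fabs

/* Hypothetical inner loop going through the wrapper. */
double sum_abs_wrapped(const double *v, int n)
{
    double s = 0.0;
    for (int i = 0; i < n; i++)
        s += wrapped_abs(v[i]);
    return s;
}

/* Same loop calling fabs directly via the macro alias. */
double sum_abs_direct(const double *v, int n)
{
    double s = 0.0;
    for (int i = 0; i < n; i++)
        s += Real64_abs(v[i]);
    return s;
}
```

Both loops compute the same result. The only difference is whether an extra function call sits between the loop and fabs. A C compiler may inline the wrapper when it can see its definition, so the size of the gap depends on how the wrapper is compiled and linked.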
My results are:
C             3.9s
sml + fabs    6.2s
orig         13s
So the SML version with fabs is about 60% slower than C.
Much better than the starting point of 240%!
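As a quick check of the arithmetic, here is a small sketch of how the slowdown relative to C can be computed from the timings. The helper name `percent_slower` is hypothetical.

```c
/* Percentage by which time `t` exceeds the baseline time `base`. */
double percent_slower(double base, double t)
{
    return (t - base) / base * 100.0;
}
```

With the timings listed, `percent_slower(3.9, 6.2)` comes out at roughly 59, and `percent_slower(3.9, 13.0)` at roughly 233, which is in the same range as the 240% figure quoted for the starting point.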
Now for the gotcha.
I constructed the 2D example to make the array indexing easier to
examine. I brought up the 2D program as an example of the slowdown
because its performance relative to C seemed to track that of the 3D
program, which led me to believe the problem was in the indexing.
However, my 3D program is a different routine, and does NOT use fabs.
So it's back to the drawing board. On the plus side, I now have some
good tools to investigate the performance.
Also, it seems like in 2 years there is a relatively good (?) chance
that the fabs behavior has changed. Maybe the "proper" abs code is
no longer required in the compiler.
Brian