more heap sort benchmark info
Henry Cejtin
henry@sourcelight.com
Wed, 20 Jun 2001 20:39:56 -0500
Well, I am really confused about this heapsort benchmark. I took the C code
and our version (the `opt') of the SML code and tweaked it so that it takes 3
arguments:
The number of times to create the initial array. Each call is an
Array.tabulate with the function calling gen_random.
The number of times to call the heap sort. Each call copies the array
(into a copy which is only allocated once per program run, not once
per heap sort call) and then calls heap sort on this copied array.
the number of elements in the arrays.
Using this I can determine how long each part takes. The timings, using
80000 elements, the biggest size he uses, are
create initial array: C .020385 seconds
MLton .012050 seconds
copy and sort: C .136650 seconds
MLton .191040 seconds
Note, we are faster than C in the creating of the array, even though the C
code doesn't do multiple allocations, it allocates the array once and then
fills it in a bunch of times. I believe that the reason is that MLton, being
no fool, has inlined the gen_random function, while gcc did not. This causes
80,000 extra function calls, and if each takes about .5 mics (longer than I
would expect) then it would explain things.
The real point to notice is that in the actual sorting, we are only about 40%
slower than gcc. This isn't good, but if you look at his graphs, it seems
that MLton is 5 times slower. Even given that the new MLton is faster, I
have no explanation.
On Doug's machine, it looks like OCaml is 36% slower than GCC, so even
ignoring the speed up in the initialization part, we should be very close.