more profiling: benchmarks
Stephen Weeks
sweeks@intertrust.com
Mon, 28 Jun 1999 23:29:11 -0700 (PDT)
Here is the profiling data for all of the usual benchmarks. Henry is
working on the "I ran out of addresses" problem.
----------------------------------------
barnes-hut
1078 ticks total
walksub_0 52.88%
gravsub_0 31.17%
insert_0 4.73%
main_0 3.25%
aux_0 1.58%
MLTON_chunkSwitch (magic) 1.39%
doit_0 0.83%
hackCofM_0 0.83%
setRoundingMode (C function) 0.83%
x_12 0.83%
round (C function) 0.74%
getRoundingMode (C function) 0.65%
class (C function) 0.19%
forward (C function) 0.09%
----------------------------------------
count-graphs
1226 ticks total
outer_0 40.54%
x_353 25.12%
outer_1 12.64%
f_1 6.85%
merge_0 6.69%
MLTON_chunkSwitch (magic) 2.94%
x_132 2.85%
x_250 2.04%
operator_0 0.33%
----------------------------------------
fft
3391 ticks total
main_0 95.34%
intQuot (C function) 3.21%
div_0 1.06%
x_0 0.29%
MLTON_chunkSwitch (magic) 0.09%
----------------------------------------
knuth-bendix
1791 ticks total
x_240 29.31%
rewrec_0 28.53%
stringEqual (C function) 11.22%
redrec_0 6.25%
MLTON_chunkSwitch (magic) 5.25%
map_rec_0 3.80%
vars_of_list_0 2.46%
MLTON_gc (magic) 2.01%
x_461 1.45%
next_criticals_0 1.28%
map_rec_8 0.95%
tryrec_0 0.84%
union_rec_0 0.84%
MLTON_endChunkSwitch (magic) 0.73%
unify_0 0.73%
part_rec_0 0.61%
suprec_0 0.56%
MLton_main (magic) 0.45%
assoc_rec_0 0.45%
forward (C function) 0.34%
rew_loop_0 0.28%
map_rec_3 0.22%
map_rec_5 0.22%
x_645 0.17%
equal_0 0.11%
foreachPointerInHeap (C function) 0.11%
occur_rec_0 0.11%
processkl_0 0.11%
x_215 0.11%
Group_rank_0 0.06%
intQuot (C function) 0.06%
intRem (C function) 0.06%
map_rec_2 0.06%
map_rec_4 0.06%
map_rec_7 0.06%
mem_rec_0 0.06%
pretty_term_0 0.06%
x_125 0.06%
----------------------------------------
lexgen
3244 ticks total
x_150 27.96%
visit_0 21.70%
x_74 8.35%
x_1805 5.02%
makeEntry_0 4.38%
MLTON_gc (magic) 3.79%
MLTON_chunkSwitch (magic) 2.99%
update_0 2.74%
intRem (C function) 2.71%
x_102 2.71%
copyVec_0 2.62%
intQuot (C function) 2.13%
f_4 2.00%
emit8_0 1.51%
union_0 1.42%
checkSlice_1 1.17%
update_1 1.14%
output_0 1.02%
store_0 0.80%
main_0 0.77%
AdvanceTok_0 0.71%
MLton_main (magic) 0.71%
forward (C function) 0.43%
MLTON_endChunkSwitch (magic) 0.37%
checkSlice_0 0.31%
getch_0 0.15%
update_3 0.15%
exp0_0 0.06%
foreachPointerInHeap (C function) 0.06%
skipws_0 0.06%
nullable_0 0.03%
----------------------------------------
life
5365 ticks total
main_0 86.06%
lexordset_0 8.63%
neighbours_0 5.18%
MLTON_chunkSwitch (magic) 0.11%
MLTON_gc (magic) 0.02%
----------------------------------------
logic
4992 ticks total
x_39 20.45%
x_20 19.91%
MLTON_gc (magic) 12.16%
x_456 9.54%
oc_2 7.19%
deref_0 5.33%
MLTON_chunkSwitch (magic) 4.65%
oc_3 3.77%
x_127 3.10%
unwind_trail_0 2.56%
MLton_main (magic) 1.96%
stringEqual (C function) 1.86%
MLTON_endChunkSwitch (magic) 1.80%
x_431 1.78%
main_0 1.40%
forward (C function) 0.46%
x_481 0.44%
oc_1 0.38%
x_124 0.30%
x_42 0.28%
foreachPointerInHeap (C function) 0.12%
oc_4 0.12%
x_344 0.08%
x_384 0.06%
x_399 0.06%
x_407 0.06%
x_368 0.04%
x_391 0.04%
x_337 0.02%
x_352 0.02%
x_360 0.02%
x_376 0.02%
----------------------------------------
mandelbrot
1561 ticks total
main_0 100.00%
----------------------------------------
matrix-multiply
1310 ticks total
main_0 99.85%
array_0 0.15%
----------------------------------------
mlyacc
I ran out of addresses
----------------------------------------
nucleic
1669 ticks total
x_275 45.06%
x_266 19.95%
x_50 14.80%
x_15 10.37%
x_168 4.37%
x_0 2.34%
x_256 1.14%
MLTON_chunkSwitch (magic) 0.90%
x_339 0.72%
forward (C function) 0.30%
tfo_inv_ortho_0 0.06%
----------------------------------------
ratio-regions
2186 ticks total
relabel_0 29.92%
preflow_push_0 24.84%
MLTON_gc (magic) 15.51%
enqueue_1 13.68%
MLton_main (magic) 2.88%
MLTON_chunkSwitch (magic) 2.65%
MLTON_endChunkSwitch (magic) 2.33%
can_lift_0 1.51%
can_push_right_0 1.42%
enqueue_0 1.37%
can_push_up_0 1.19%
can_push_down_0 1.14%
can_push_left_0 0.73%
forward (C function) 0.32%
main_0 0.18%
foreachPointerInHeap (C function) 0.14%
intQuot (C function) 0.09%
make_matrix_0 0.09%
----------------------------------------
simple
1977 ticks total
MLTON_gc (magic) 21.65%
polynomial_0 16.44%
pow_0 9.81%
MLTON_chunkSwitch (magic) 8.45%
index_0 5.97%
x_2199 5.56%
from_1 4.35%
MLTON_endChunkSwitch (magic) 4.25%
MLton_main (magic) 3.84%
x_274 3.39%
f2_0 2.73%
line_integral_0 2.23%
x_2644 2.23%
main_0 1.57%
x_1752 0.86%
x_7525 0.76%
zone_area_vol_0 0.76%
table_search_0 0.71%
x_2296 0.61%
x_2572 0.61%
flatten_0 0.56%
x_2435 0.56%
x_2370 0.51%
x_2507 0.51%
x_2308 0.35%
from_0 0.25%
x_2316 0.20%
f2_1 0.10%
f2_2 0.10%
reflect_node_0 0.05%
work_done_0 0.05%
----------------------------------------
tsp
2445 ticks total
tsp_0 75.05%
locateCycle_0 11.37%
build_0 3.31%
median_0 2.37%
intQuot (C function) 2.17%
drand48_0 2.09%
makeList_0 1.80%
intRem (C function) 0.74%
MLTON_chunkSwitch (magic) 0.49%
mod_0 0.37%
div_0 0.16%
forward (C function) 0.08%
----------------------------------------
vliw
I ran out of addresses
----------------------------------------
zern
4713 ticks total
main_0 100.00%
----------------------------------------
Even more interestingly, here is the same data with only the "magic"
entries included. Anything labeled magic is infrastructure crap that
had to be put in because we're compiling to C. These costs should
trivially go to zero with a native backend. That will give decent
speedups for logic and ratio-regions, and a great speedup for simple.
These numbers also tell us that there's almost nothing to be gained by
better chunk coalescing in the C backend, since the dominant magic cost
is usually MLTON_gc, which is the code that copies C local pointers to
and from an array before and after a gc.
----------------------------------------
barnes-hut
1078 ticks total
MLTON_chunkSwitch (magic) 1.39%
----------------------------------------
count-graphs
1226 ticks total
MLTON_chunkSwitch (magic) 2.94%
----------------------------------------
fft
3391 ticks total
MLTON_chunkSwitch (magic) 0.09%
----------------------------------------
knuth-bendix
1791 ticks total
MLTON_chunkSwitch (magic) 5.25%
MLTON_gc (magic) 2.01%
MLTON_endChunkSwitch (magic) 0.73%
MLton_main (magic) 0.45%
----------------------------------------
lexgen
MLTON_gc (magic) 3.79%
MLTON_chunkSwitch (magic) 2.99%
MLton_main (magic) 0.71%
MLTON_endChunkSwitch (magic) 0.37%
----------------------------------------
life
MLTON_chunkSwitch (magic) 0.11%
MLTON_gc (magic) 0.02%
----------------------------------------
logic
MLTON_gc (magic) 12.16%
MLTON_chunkSwitch (magic) 4.65%
MLton_main (magic) 1.96%
MLTON_endChunkSwitch (magic) 1.80%
----------------------------------------
mandelbrot
----------------------------------------
matrix-multiply
----------------------------------------
mlyacc
----------------------------------------
nucleic
MLTON_chunkSwitch (magic) 0.90%
----------------------------------------
ratio-regions
MLTON_gc (magic) 15.51%
MLton_main (magic) 2.88%
MLTON_chunkSwitch (magic) 2.65%
MLTON_endChunkSwitch (magic) 2.33%
----------------------------------------
simple
MLTON_gc (magic) 21.65%
MLTON_chunkSwitch (magic) 8.45%
MLTON_endChunkSwitch (magic) 4.25%
MLton_main (magic) 3.84%
----------------------------------------
tsp
MLTON_chunkSwitch (magic) 0.49%
----------------------------------------
vliw
----------------------------------------
zern
----------------------------------------
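To put rough numbers on the speedup claims, the magic entries above can be
summed per benchmark and turned into an upper bound on the gain. This is only
a sketch in Python, not part of the profiler: the table is copied by hand
from the magic-only listing, the function names are made up for illustration,
and it assumes that tick percentages map directly to run time and that
removing the magic code leaves everything else unchanged.

```python
# Magic-only percentages, copied by hand from the listing above
# (percent of total ticks per benchmark). Benchmarks with no magic
# entries, or with no data at all, are left out.
MAGIC = {
    "barnes-hut": {"MLTON_chunkSwitch": 1.39},
    "count-graphs": {"MLTON_chunkSwitch": 2.94},
    "fft": {"MLTON_chunkSwitch": 0.09},
    "knuth-bendix": {"MLTON_chunkSwitch": 5.25, "MLTON_gc": 2.01,
                     "MLTON_endChunkSwitch": 0.73, "MLton_main": 0.45},
    "lexgen": {"MLTON_gc": 3.79, "MLTON_chunkSwitch": 2.99,
               "MLton_main": 0.71, "MLTON_endChunkSwitch": 0.37},
    "life": {"MLTON_chunkSwitch": 0.11, "MLTON_gc": 0.02},
    "logic": {"MLTON_gc": 12.16, "MLTON_chunkSwitch": 4.65,
              "MLton_main": 1.96, "MLTON_endChunkSwitch": 1.80},
    "nucleic": {"MLTON_chunkSwitch": 0.90},
    "ratio-regions": {"MLTON_gc": 15.51, "MLton_main": 2.88,
                      "MLTON_chunkSwitch": 2.65, "MLTON_endChunkSwitch": 2.33},
    "simple": {"MLTON_gc": 21.65, "MLTON_chunkSwitch": 8.45,
               "MLTON_endChunkSwitch": 4.25, "MLton_main": 3.84},
    "tsp": {"MLTON_chunkSwitch": 0.49},
}


def magic_share(name):
    """Total percentage of ticks spent in magic code for one benchmark."""
    return sum(MAGIC[name].values())


def speedup_bound(name):
    """Upper bound on speedup if every magic tick disappeared.

    Assumption: ticks are proportional to run time, so removing a
    fraction p of the ticks gives a speedup of 1 / (1 - p).
    """
    return 1.0 / (1.0 - magic_share(name) / 100.0)


def summarize():
    """Return (benchmark, magic %, speedup bound), largest share first."""
    rows = [(name, magic_share(name), speedup_bound(name)) for name in MAGIC]
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows


if __name__ == "__main__":
    for name, share, bound in summarize():
        print(f"{name:15s} {share:6.2f}% magic   at most {bound:.2f}x")
```

Under those assumptions simple spends about 38% of its ticks in magic code,
ratio-regions about 23%, and logic about 21%. Everything else is under 10%.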