[MLton] Question on profile.fun
Matthew Fluet
fluet@cs.cornell.edu
Sun, 5 Jun 2005 14:09:05 -0400 (EDT)
> So, I admit that the bound on code-insertion for time-profiling isn't all
> that good. Again, we might argue that this is an upper-bound, since the
> inserted code is more expensive than a single move. I'll see about
> queueing up that experiment.
MLton0 -- mlton -profile no
MLton1 -- mlton -profile drop
MLton2 -- mlton -profile drop -profile-dummy mov
MLton3 -- mlton -profile drop -profile-dummy inc
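Distribution of the MLton3/MLton2 run-time ratio (inc vs. mov) over the
42 benchmarks; the counts are cumulative:
  #   MLton3/MLton2 <=
 --   ----------------
 17   1.0
 35   1.1
 38   1.2
outliers: 1.22 life, 1.30 imp-for, 1.31 tailfib, 1.35 peek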
So, for the most part, a mov vs. an inc is a wash. The exceptions are
tight loops that entail lots of updates to the gcState, in which case
the mov does do better. All in all, no surprises, and yet further
evidence that code insertion is too intrusive for time profiling.
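For anyone who wants to recheck the mov vs. inc summary, here is a minimal Python sketch that recomputes it from the MLton2 and MLton3 columns of the "run time" table in this message. The helper names (inc_over_mov, cumulative_counts) are made up for illustration, and rounding each ratio to two decimals before bucketing is an assumption on my part; it is not necessarily how the original numbers were produced.

```python
# Benchmark, MLton2 (mov) run time, MLton3 (inc) run time, copied from the
# "run time" table in this message.
RUN_TIMES = """
barnes-hut 62.89 62.36
boyer 68.31 66.93
checksum 196.39 186.62
count-graphs 57.75 55.52
DLXSimulator 105.92 107.19
fft 36.91 38.31
fib 100.12 101.97
flat-array 34.01 33.88
hamlet 58.79 59.48
imp-for 85.03 110.53
knuth-bendix 52.05 54.80
lexgen 49.08 49.35
life 20.44 24.95
logic 60.01 59.13
mandelbrot 90.43 92.88
matrix-multiply 8.54 8.64
md5 83.85 86.62
merge 87.99 87.91
mlyacc 43.13 43.40
model-elimination 88.83 87.13
mpuz 51.95 55.95
nucleic 46.11 46.27
output1 16.21 18.01
peek 53.36 71.85
psdes-random 54.00 62.33
ratio-regions 65.78 67.26
ray 35.10 33.53
raytrace 48.52 46.96
simple 67.15 68.24
smith-normal-form 37.17 37.16
tailfib 50.87 66.67
tak 36.51 38.05
tensor 73.70 88.37
tsp 66.66 67.07
tyan 64.14 64.23
vector-concat 92.65 92.22
vector-rev 125.78 123.88
vliw 60.18 57.35
wc-input1 49.89 53.64
wc-scanStream 285.63 288.64
zebra 60.91 66.80
zern 62.74 51.16
"""


def inc_over_mov(text):
    """Map each benchmark to its MLton3/MLton2 ratio, rounded to 2 places.

    The two-decimal rounding is an assumption, not the author's stated method.
    """
    ratios = {}
    for line in text.split("\n"):
        if not line.strip():
            continue  # skip the blank first/last lines of the literal
        name, mov, inc = line.split()
        ratios[name] = round(float(inc) / float(mov), 2)
    return ratios


def cumulative_counts(ratios, bounds=(1.0, 1.1, 1.2)):
    """Count the benchmarks whose ratio is <= each bound (cumulative)."""
    return [sum(1 for r in ratios.values() if r <= b) for b in bounds]


ratios = inc_over_mov(RUN_TIMES)
counts = cumulative_counts(ratios)
# Everything above the last bound, sorted by ratio.
outliers = sorted((r, name) for name, r in ratios.items() if r > 1.2)

print(counts)
print(outliers)
```

Under that rounding assumption the counts come out as 17, 35 and 38, and the four benchmarks above 1.2 are life, imp-for, tailfib and peek, matching the summary.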
MLton0 -- mlton -profile no
MLton1 -- mlton -profile drop
MLton2 -- mlton -profile drop -profile-dummy mov
MLton3 -- mlton -profile drop -profile-dummy inc
MLton4 -- mlton -profile label
MLton5 -- mlton -profile label -profile-dummy mov
MLton6 -- mlton -profile label -profile-dummy inc
MLton7 -- mlton -profile time
MLton8 -- mlton -profile time -profile-dummy mov
MLton9 -- mlton -profile time -profile-dummy inc
run time ratio
benchmark MLton0 MLton1 MLton2 MLton3 MLton4 MLton5 MLton6 MLton7 MLton8 MLton9
barnes-hut 1.00 1.06 1.10 1.09 1.07 1.10 1.06 1.03 1.02 1.04
boyer 1.00 1.09 1.12 1.10 1.09 1.16 1.14 1.00 1.11 1.12
checksum 1.00 1.70 1.84 1.75 1.68 1.80 1.75 1.68 1.84 1.75
count-graphs 1.00 1.03 1.19 1.15 1.09 1.16 1.18 1.08 1.17 1.18
DLXSimulator 1.00 1.00 1.03 1.04 0.99 1.03 0.90 0.96 0.98 1.05
fft 1.00 0.99 1.00 1.04 1.03 1.06 1.04 1.08 1.11 1.11
fib 1.00 1.36 1.41 1.44 1.41 1.49 1.50 1.39 1.59 1.51
flat-array 1.00 1.00 1.06 1.06 1.06 0.97 1.09 0.86 0.89 0.91
hamlet 1.00 1.06 1.11 1.12 1.11 1.14 1.15 1.12 1.16 1.17
imp-for 1.00 1.00 1.81 2.35 1.00 1.46 2.32 0.99 1.45 2.32
knuth-bendix 1.00 1.20 1.34 1.41 1.25 1.27 1.34 1.25 1.28 1.34
lexgen 1.00 1.05 1.10 1.11 1.08 1.10 1.15 1.06 1.08 1.13
life 1.00 1.15 1.41 1.72 1.31 1.60 1.79 1.35 1.64 1.83
logic 1.00 1.02 1.08 1.06 1.04 1.10 1.10 1.05 1.11 1.11
mandelbrot 1.00 1.04 1.09 1.12 1.04 1.15 1.11 0.68 0.71 0.77
matrix-multiply 1.00 0.99 1.11 1.13 0.98 1.10 1.12 0.89 1.18 1.25
md5 1.00 1.40 1.57 1.62 1.45 1.67 1.83 1.45 1.67 1.83
merge 1.00 1.00 1.03 1.03 1.00 1.03 1.03 1.01 1.04 1.04
mlyacc 1.00 1.03 1.05 1.06 1.04 1.08 1.08 1.05 1.08 1.09
model-elimination 1.00 1.05 1.11 1.08 1.07 1.12 1.11 1.08 1.13 1.12
mpuz 1.00 1.00 1.24 1.33 1.02 1.26 1.33 1.02 1.26 1.34
nucleic 1.00 0.99 1.01 1.01 0.99 1.01 1.02 1.08 1.02 1.02
output1 1.00 0.97 1.06 1.18 0.97 1.06 1.18 0.97 1.06 1.18
peek 1.00 1.25 1.50 2.02 1.26 1.50 2.00 1.25 1.50 2.00
psdes-random 1.00 1.10 1.38 1.60 1.10 1.36 1.59 1.10 1.41 1.63
ratio-regions 1.00 1.13 1.24 1.26 1.16 1.27 1.31 1.17 1.29 1.32
ray 1.00 1.07 1.14 1.09 1.07 1.10 1.12 1.05 1.08 1.12
raytrace 1.00 1.03 1.14 1.10 1.05 1.17 1.12 1.06 1.17 1.13
simple 1.00 1.00 1.11 1.13 1.02 1.12 1.15 0.94 1.05 1.07
smith-normal-form 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.01 1.01 1.01
tailfib 1.00 0.96 1.16 1.52 0.96 1.16 1.52 0.96 1.16 1.52
tak 1.00 1.36 1.32 1.37 1.44 1.39 1.47 1.43 1.39 1.46
tensor 1.00 0.84 1.25 1.49 0.84 1.25 1.49 0.84 1.25 1.49
tsp 1.00 1.01 1.03 1.04 1.03 1.03 1.04 1.03 1.03 1.04
tyan 1.00 1.07 1.11 1.11 1.08 1.12 1.12 1.09 1.12 1.12
vector-concat 1.00 1.01 1.01 1.01 1.01 1.01 1.01 1.02 1.03 1.11
vector-rev 1.00 0.66 0.65 0.64 0.61 0.67 0.84 0.69 0.64 0.65
vliw 1.00 1.09 1.13 1.08 1.24 1.19 1.14 1.10 1.21 1.16
wc-input1 1.00 1.24 1.28 1.38 1.20 1.21 1.42 1.18 1.19 1.41
wc-scanStream 1.00 8.45 8.35 8.44 8.16 7.26 7.29 7.14 7.23 7.30
zebra 1.00 1.05 1.50 1.64 1.29 1.50 1.58 1.27 1.52 1.62
zern 1.00 1.02 1.52 1.24 1.10 1.36 1.60 1.10 1.20 1.23
size
benchmark MLton0 MLton1 MLton2 MLton3 MLton4 MLton5 MLton6 MLton7 MLton8 MLton9
barnes-hut 99,700 120,058 125,594 122,938 145,314 150,690 148,194 149,084 154,368 151,976
boyer 135,375 162,147 177,267 171,379 206,323 221,539 215,699 213,565 228,749 222,925
checksum 50,095 52,627 52,979 52,835 55,755 56,107 55,979 62,725 63,061 62,949
count-graphs 63,135 76,263 82,239 78,911 93,783 99,719 96,423 102,129 107,777 104,657
DLXSimulator 126,067 165,179 179,371 172,091 216,299 229,819 222,707 221,613 235,117 228,021
fft 61,358 67,718 70,662 68,934 75,982 78,878 77,150 84,232 87,176 85,384
fib 44,691 47,047 47,399 47,207 50,143 50,495 50,351 57,129 57,481 57,337
flat-array 44,715 46,999 47,287 47,159 50,143 50,415 50,303 57,113 57,369 57,273
hamlet 1,246,854 1,913,294 2,204,158 2,061,582 2,699,934 2,996,302 2,858,062 2,704,854 3,001,174 2,862,966
imp-for 44,547 47,711 48,767 48,159 52,551 53,623 53,031 59,457 60,545 59,937
knuth-bendix 105,907 133,055 142,551 137,823 165,595 175,003 170,571 172,517 181,861 177,637
lexgen 199,332 270,180 296,204 283,412 358,500 384,420 372,220 361,020 386,956 374,724
life 62,059 71,295 74,711 72,823 85,407 88,799 86,879 91,801 95,225 93,305
logic 103,567 133,503 146,847 138,303 172,679 186,311 177,751 179,857 193,441 184,913
mandelbrot 44,643 47,055 47,471 47,295 50,215 50,599 50,439 57,153 57,537 57,393
matrix-multiply 46,294 49,750 51,062 50,166 54,414 55,742 54,846 61,200 62,496 61,632
md5 74,531 84,991 87,775 85,647 97,431 100,215 98,151 103,953 106,593 104,721
merge 46,271 49,279 49,855 49,599 53,255 53,815 53,575 60,161 60,721 60,497
mlyacc 501,140 696,352 776,376 733,048 933,496 1,013,704 971,240 935,984 1,016,192 973,728
model-elimination 631,901 884,693 987,165 933,909 1,236,233 1,338,401 1,286,281 1,241,211 1,343,459 1,291,339
mpuz 47,307 53,543 57,159 54,695 60,927 64,495 62,015 69,345 72,081 70,289
nucleic 196,246 208,546 211,250 209,810 221,650 224,370 223,026 228,420 231,124 229,812
output1 77,373 84,953 86,001 85,561 97,689 98,769 98,377 99,465 100,593 100,169
peek 73,483 81,495 82,607 82,231 92,551 93,695 93,335 97,649 98,665 98,321
psdes-random 45,355 48,407 49,047 48,727 52,255 52,895 52,591 59,177 59,817 59,513
ratio-regions 70,387 102,063 116,223 108,191 126,567 140,599 132,791 135,497 149,577 141,769
ray 178,284 226,868 237,900 231,620 291,556 302,876 296,644 298,444 309,796 303,532
raytrace 260,497 326,209 361,697 342,833 440,625 476,065 457,457 441,497 477,001 458,473
simple 219,103 302,895 339,863 317,871 379,215 416,335 394,463 383,857 421,137 399,249
smith-normal-form 178,867 196,427 204,379 199,259 220,459 228,411 223,275 225,973 233,957 228,837
tailfib 44,387 46,679 47,015 46,855 49,727 50,063 49,919 56,681 57,001 56,873
tak 44,771 46,943 47,215 47,071 49,959 50,215 50,119 56,993 57,249 57,153
tensor 94,850 117,762 131,282 123,322 156,514 169,314 161,466 161,940 174,740 166,900
tsp 79,059 90,295 94,231 91,895 105,999 109,967 107,679 112,377 116,233 114,121
tyan 132,123 175,379 193,723 182,851 232,427 250,931 240,331 238,141 256,637 246,101
vector-concat 45,971 48,375 48,567 48,503 52,047 52,239 52,191 59,001 59,177 59,145
vector-rev 45,199 47,863 48,199 48,039 51,311 51,647 51,503 58,313 58,633 58,505
vliw 387,187 668,207 744,447 706,111 916,471 993,567 955,039 919,071 996,151 957,623
wc-input1 99,071 115,083 116,979 116,139 138,827 140,659 140,027 140,963 142,795 142,163
wc-scanStream 106,215 123,483 125,523 124,619 151,215 153,175 152,495 153,415 155,351 154,695
zebra 121,515 157,363 185,627 174,371 264,243 289,723 277,699 270,589 295,941 284,125
zern 85,796 94,308 97,908 95,460 106,444 110,060 107,724 115,222 118,918 116,534
compile time
benchmark MLton0 MLton1 MLton2 MLton3 MLton4 MLton5 MLton6 MLton7 MLton8 MLton9
barnes-hut 9.90 10.10 11.46 11.44 11.59 11.37 10.38 11.63 11.87 11.19
boyer 10.58 11.22 12.16 12.23 12.05 11.50 12.40 12.11 11.48 11.86
checksum 7.55 7.68 7.31 7.71 7.72 7.64 7.10 7.36 7.93 6.75
count-graphs 7.81 8.85 7.90 8.49 8.86 9.17 9.25 9.34 8.18 9.48
DLXSimulator 11.94 12.59 12.02 13.10 12.87 13.94 12.29 12.01 12.30 12.64
fft 8.15 7.29 7.26 7.34 7.49 7.52 7.75 7.83 7.88 7.95
fib 6.79 7.20 7.28 7.12 7.49 7.87 6.86 7.42 7.60 7.87
flat-array 7.04 7.14 6.98 7.09 6.85 7.43 7.03 6.59 6.62 6.70
hamlet 62.42 75.16 77.19 79.42 86.85 89.46 91.90 87.08 89.36 92.12
imp-for 6.26 6.41 6.40 6.59 7.78 7.79 6.50 6.66 6.65 6.70
knuth-bendix 8.14 8.85 10.88 11.07 11.29 9.33 9.34 9.41 9.45 9.47
lexgen 11.71 13.20 13.44 13.56 14.32 14.48 14.69 14.23 14.50 14.68
life 6.94 7.16 7.21 7.19 7.28 7.26 7.33 7.45 7.45 7.49
logic 8.37 8.93 9.08 9.21 9.38 9.55 9.68 9.64 9.68 9.87
mandelbrot 6.27 6.41 6.51 6.46 6.47 6.54 6.53 6.79 6.65 6.61
matrix-multiply 6.39 6.58 6.60 6.58 6.64 6.59 6.60 6.82 6.80 6.80
md5 7.15 7.60 7.60 7.55 7.70 7.77 7.71 7.92 7.86 7.89
merge 6.33 6.46 6.46 6.44 6.50 6.53 6.51 6.69 6.65 6.71
mlyacc 27.03 31.69 32.33 32.77 34.92 35.74 36.21 35.03 35.71 36.26
model-elimination 29.45 35.32 36.12 36.90 40.26 41.08 41.59 40.47 41.21 41.89
mpuz 6.41 6.62 6.64 6.61 6.74 6.72 6.94 6.86 6.96 6.92
nucleic 13.84 14.11 14.14 14.33 14.25 14.18 14.32 14.49 14.62 14.54
output1 7.13 7.41 7.46 7.44 7.55 7.61 7.60 7.54 7.61 7.64
peek 6.99 7.37 7.48 7.42 9.20 7.50 7.53 7.64 7.63 7.63
psdes-random 6.29 6.48 6.48 6.50 6.54 6.58 6.55 6.73 6.73 6.73
ratio-regions 7.61 8.21 8.35 8.40 8.52 8.60 8.75 8.80 8.90 8.96
ray 10.38 11.85 11.95 12.01 12.64 12.74 12.81 12.88 12.97 12.98
raytrace 14.89 16.56 16.82 17.09 18.02 18.34 18.51 17.99 18.30 18.66
simple 12.26 13.56 13.84 14.05 14.53 14.81 14.96 14.61 14.88 15.22
smith-normal-form 10.56 11.19 11.38 11.43 11.54 11.68 11.71 11.75 11.73 11.79
tailfib 6.24 6.37 6.38 6.38 6.45 6.61 6.45 6.66 6.61 6.58
tak 6.19 6.39 6.39 6.42 6.41 6.40 6.39 6.59 6.61 6.65
tensor 8.94 9.97 10.01 10.06 10.34 10.42 10.57 10.51 10.58 10.72
tsp 7.43 7.96 8.03 8.01 8.11 8.16 8.17 8.29 8.34 8.37
tyan 9.75 10.94 11.16 11.27 11.72 11.84 12.01 11.87 11.97 12.12
vector-concat 6.34 6.51 6.49 6.48 6.51 6.61 6.57 6.68 6.70 6.76
vector-rev 9.72 9.96 8.36 6.44 6.49 6.47 7.93 8.12 6.70 6.64
vliw 20.52 27.12 27.67 28.16 30.44 30.98 31.46 30.50 31.12 31.47
wc-input1 7.92 8.55 8.57 8.53 8.84 8.89 8.86 8.86 8.87 8.93
wc-scanStream 8.25 8.87 10.73 8.87 11.11 9.25 9.24 9.25 9.31 9.36
zebra 9.34 10.30 10.55 10.70 11.27 11.46 11.68 11.50 11.67 11.85
zern 6.98 7.50 9.02 8.05 8.27 7.98 11.08 10.93 8.31 8.39
run time
benchmark MLton0 MLton1 MLton2 MLton3 MLton4 MLton5 MLton6 MLton7 MLton8 MLton9
barnes-hut 57.24 60.61 62.89 62.36 61.42 62.92 60.94 59.01 58.45 59.28
boyer 60.94 66.68 68.31 66.93 66.57 70.50 69.49 60.80 67.46 68.27
checksum 106.87 181.15 196.39 186.62 180.07 192.27 186.60 179.62 196.48 187.28
count-graphs 48.46 50.00 57.75 55.52 52.84 55.99 57.42 52.16 56.59 57.20
DLXSimulator 102.96 103.28 105.92 107.19 102.11 106.13 93.10 99.22 100.93 108.54
fft 37.00 36.63 36.91 38.31 37.95 39.35 38.47 39.79 40.98 41.12
fib 70.90 96.63 100.12 101.97 100.29 105.91 106.08 98.24 112.69 107.40
flat-array 31.95 32.09 34.01 33.88 33.92 31.11 34.87 27.55 28.50 29.20
hamlet 53.17 56.22 58.79 59.48 59.18 60.61 61.39 59.74 61.93 62.06
imp-for 46.97 46.98 85.03 110.53 46.83 68.45 108.97 46.38 68.09 109.04
knuth-bendix 38.73 46.54 52.05 54.80 48.51 49.22 51.70 48.41 49.48 52.00
lexgen 44.58 46.75 49.08 49.35 47.95 48.83 51.13 47.11 48.08 50.28
life 14.49 16.59 20.44 24.95 19.01 23.15 25.91 19.51 23.73 26.55
logic 55.79 56.95 60.01 59.13 58.19 61.53 61.38 58.51 61.79 61.70
mandelbrot 82.74 86.12 90.43 92.88 86.12 94.78 91.62 56.24 58.49 63.65
matrix-multiply 7.68 7.58 8.54 8.64 7.52 8.46 8.60 6.84 9.08 9.57
md5 53.40 74.92 83.85 86.62 77.61 89.20 97.46 77.68 89.22 97.62
merge 85.59 85.54 87.99 87.91 85.50 88.50 88.30 86.03 88.79 88.94
mlyacc 41.05 42.13 43.13 43.40 42.88 44.16 44.38 43.09 44.42 44.59
model-elimination 80.34 84.06 88.83 87.13 86.02 90.37 89.34 86.76 90.58 90.36
mpuz 41.98 41.92 51.95 55.95 42.99 52.85 55.96 42.91 52.93 56.09
nucleic 45.62 45.22 46.11 46.27 45.03 46.25 46.45 49.45 46.40 46.61
output1 15.28 14.89 16.21 18.01 14.88 16.21 18.00 14.89 16.23 18.01
peek 35.59 44.45 53.36 71.85 44.88 53.34 71.16 44.46 53.37 71.16
psdes-random 39.00 43.00 54.00 62.33 43.01 53.02 62.05 43.04 55.07 63.41
ratio-regions 53.19 60.09 65.78 67.26 61.65 67.79 69.45 61.97 68.71 70.11
ray 30.81 32.93 35.10 33.53 32.89 33.88 34.41 32.33 33.36 34.38
raytrace 42.70 44.12 48.52 46.96 44.87 49.77 47.67 45.23 50.00 48.09
simple 60.58 60.83 67.15 68.24 61.94 68.14 69.51 57.09 63.34 64.72
smith-normal-form 37.07 37.20 37.17 37.16 37.13 37.17 37.16 37.37 37.41 37.33
tailfib 43.94 41.98 50.87 66.67 41.97 50.88 66.71 42.01 50.91 66.75
tak 27.71 37.69 36.51 38.05 39.88 38.44 40.60 39.57 38.49 40.59
tensor 59.18 49.48 73.70 88.37 49.51 73.75 88.41 49.53 73.75 88.39
tsp 64.71 65.23 66.66 67.07 66.74 66.73 67.19 66.75 66.89 67.27
tyan 57.99 62.09 64.14 64.23 62.60 64.83 64.93 63.20 65.14 65.23
vector-concat 91.58 92.20 92.65 92.22 92.64 92.65 92.42 93.48 94.30 101.66
vector-rev 192.95 126.75 125.78 123.88 117.43 129.67 162.24 133.67 123.70 125.50
vliw 53.31 57.85 60.18 57.35 66.30 63.35 60.80 58.73 64.36 61.87
wc-input1 38.84 48.06 49.89 53.64 46.67 46.92 55.30 45.84 46.05 54.65
wc-scanStream 34.20 289.13 285.63 288.64 279.21 248.43 249.22 244.16 247.24 249.64
zebra 40.66 42.88 60.91 66.80 52.33 61.03 64.26 51.81 61.64 65.76
zern 41.36 42.37 62.74 51.16 45.63 56.37 66.07 45.30 49.46 50.87
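The "run time ratio" table earlier in this message appears to be each row of the "run time" table divided by its MLton0 entry. A small Python sketch of that normalization, using the imp-for row as input; the function name normalize is hypothetical and the two-decimal formatting is assumed from how the ratio table is printed:

```python
def normalize(times):
    """Divide every entry by the first (MLton0) entry, formatted to 2 places."""
    base = times[0]
    return [f"{t / base:.2f}" for t in times]


# imp-for row of the "run time" table, MLton0 through MLton9.
imp_for = [46.97, 46.98, 85.03, 110.53, 46.83,
           68.45, 108.97, 46.38, 68.09, 109.04]

print(" ".join(normalize(imp_for)))
# prints: 1.00 1.00 1.81 2.35 1.00 1.46 2.32 0.99 1.45 2.32
```

This matches the imp-for row of the ratio table, which is at least consistent with the ratios having been derived this way.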