-native-optimize 2 bug
Matthew Fluet
fluet@cs.cornell.edu
Thu, 7 Feb 2002 22:16:07 -0500 (EST)
To check my previous claim, I ran the benchmarks with
-native-optimize {0,1,2}. As I thought, there is virtually no difference
between 1 and 2. Disturbingly, there are lots of problems with
-native-optimize 0. I haven't looked into them yet, but I will in the not
too distant future. While it is clear that doing some x86 optimizations is
a win, I don't like being dependent on optimizations for correctness.
MLton0 -- mlton -native-optimize 0
MLton1 -- mlton -native-optimize 1
MLton2 -- mlton -native-optimize 2
compile time
benchmark           MLton0  MLton1  MLton2
barnes-hut            2.22    2.15    2.23
checksum              0.61    0.63    0.62
count-graphs          1.63    1.58    1.64
DLXSimulator          4.20    3.93    4.15
fft                   1.22    1.19    1.24
fib                   0.59    0.57    0.59
hamlet               45.56   40.75   42.82
imp-for               0.62    0.64    0.62
knuth-bendix          2.05    1.89    1.97
lexgen                5.19    4.81    5.05
life                  1.26    1.18    1.21
logic                 2.58    2.49    2.62
mandelbrot            0.62    0.63    0.67
matrix-multiply       0.71    0.71    0.72
md5                   1.21    1.13    1.17
merge                 0.66    0.62    0.64
mlyacc               19.34   17.56   18.51
mpuz                  0.83    0.83    0.84
nucleic               2.36    2.33    2.33
peek                  0.96    0.97    0.97
psdes-random          0.67    0.65    0.66
ratio-regions         2.34    2.16    2.21
ray                   3.18    2.91    3.16
raytrace              9.38    8.89    9.85
simple                6.48    6.05    6.53
smith-normal-form     7.20    7.16    7.28
tailfib               0.60    0.60    0.59
tak                   0.60    0.58    0.60
tensor                2.77    2.66    3.12
tsp                   1.40    1.37    1.41
tyan                  3.58    3.32    3.49
vector-concat         0.67    0.66    0.66
vector-rev            0.64    0.63    0.63
vliw                 11.10   10.16   10.85
wc-input1             1.61    1.49    1.57
wc-scanStream         1.66    1.54    1.61
zebra                 5.59    5.07    5.46
zern                  1.10    1.01    1.06
run time
benchmark           MLton0  MLton1  MLton2
barnes-hut               *    3.77    3.78
checksum              4.04    3.18    3.18
count-graphs          4.98    3.76    3.82
DLXSimulator             *   14.71   14.76
fft                  10.03    8.08    8.13
fib                   3.91    3.37    3.37
hamlet                   *    7.04    7.05
imp-for              12.78    7.16    7.16
knuth-bendix             *    5.67    5.67
lexgen                   *    9.19    9.21
life                  8.51    6.32    6.30
logic                 0.01   17.75   17.75
mandelbrot            9.31    6.06    6.06
matrix-multiply       4.48    2.77    2.77
md5                   2.46    1.76    1.77
merge                49.48   48.05   48.13
mlyacc                   *    8.79    8.82
mpuz                  6.67    4.25    4.25
nucleic               8.39    8.02    8.03
peek                     *    0.82    0.82
psdes-random          4.34    3.20    3.29
ratio-regions        10.93    8.39    8.41
ray                      *    3.56    3.52
raytrace              0.01    4.88    4.89
simple                   *    5.85    5.85
smith-normal-form        *    0.66    0.66
tailfib              20.37   10.95   10.95
tak                   9.88    7.74    7.74
tensor                   *    3.68    3.68
tsp                   9.33    7.51    7.51
tyan                     *   16.07   16.05
vector-concat         6.17    2.24    2.87
vector-rev            4.14    4.20    4.18
vliw                     *    5.65    5.65
wc-input1                *    1.93    1.92
wc-scanStream            *    2.12    2.12
zebra                 2.86    1.80    1.90
zern                 39.38   33.42   33.29
run time ratio (relative to MLton0)
benchmark           MLton1  MLton2
barnes-hut           ~1.00   ~1.00
checksum              0.79    0.79
count-graphs          0.76    0.77
DLXSimulator         ~1.00   ~1.00
fft                   0.81    0.81
fib                   0.86    0.86
hamlet               ~1.00   ~1.00
imp-for               0.56    0.56
knuth-bendix         ~1.00   ~1.00
lexgen               ~1.00   ~1.00
life                  0.74    0.74
logic              1604.26 1603.81
mandelbrot            0.65    0.65
matrix-multiply       0.62    0.62
md5                   0.72    0.72
merge                 0.97    0.97
mlyacc               ~1.00   ~1.00
mpuz                  0.64    0.64
nucleic               0.96    0.96
peek                 ~1.00   ~1.00
psdes-random          0.74    0.76
ratio-regions         0.77    0.77
ray                  ~1.00   ~1.00
raytrace            432.61  433.63
simple               ~1.00   ~1.00
smith-normal-form    ~1.00   ~1.00
tailfib               0.54    0.54
tak                   0.78    0.78
tensor               ~1.00   ~1.00
tsp                   0.81    0.81
tyan                 ~1.00   ~1.00
vector-concat         0.36    0.47
vector-rev            1.01    1.01
vliw                 ~1.00   ~1.00
wc-input1            ~1.00   ~1.00
wc-scanStream        ~1.00   ~1.00
zebra                 0.63    0.66
zern                  0.85    0.85
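
For anyone wanting to recheck the ratio column, here is a minimal Python
sketch of the arithmetic: each ratio is the MLton1 or MLton2 run time divided
by the MLton0 run time, rounded to two places. The function name
runtime_ratio and the use of None for a "*" entry are my own illustrative
choices, not part of the benchmark scripts. The logic and raytrace ratios
cannot be reproduced this way from the printed 0.01 figures, so those were
presumably computed from unrounded timings.

```python
def runtime_ratio(baseline, other):
    """Return other / baseline rounded to two places.

    Returns None when the baseline is missing (a "*" entry) or zero.
    """
    if baseline is None or baseline == 0:
        return None
    return round(other / baseline, 2)


# (MLton0, MLton1, MLton2) run times copied from the run time table.
# None stands in for a "*" entry (hypothetical encoding).
run_times = {
    "checksum": (4.04, 3.18, 3.18),
    "imp-for": (12.78, 7.16, 7.16),
    "vector-concat": (6.17, 2.24, 2.87),
    "barnes-hut": (None, 3.77, 3.78),
}

ratios = {
    name: (runtime_ratio(t0, t1), runtime_ratio(t0, t2))
    for name, (t0, t1, t2) in run_times.items()
}

for name, (r1, r2) in ratios.items():
    print(name, r1, r2)
```

A ratio below 1.00 means the optimized build ran faster than the
-native-optimize 0 build.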
size
benchmark              MLton0     MLton1     MLton2
barnes-hut             64,996     57,604     57,604
checksum               24,225     23,809     23,809
count-graphs           51,937     44,385     44,353
DLXSimulator          106,665     88,137     88,105
fft                    37,393     33,777     33,745
fib                    24,161     23,873     23,873
hamlet              1,399,976  1,102,568  1,102,472
imp-for                24,193     23,841     23,841
knuth-bendix           74,898     64,306     64,306
lexgen                174,513    146,769    146,737
life                   43,969     40,641     40,609
logic                  87,073     80,897     80,897
mandelbrot             24,385     23,937     23,937
matrix-multiply        25,217     24,513     24,513
md5                    38,226     33,458     33,458
merge                  25,537     25,057     25,057
mlyacc                553,329    468,401    468,401
mpuz                   30,305     28,161     28,161
nucleic                65,537     62,913     62,913
peek                   35,250     31,922     31,922
psdes-random           25,313     24,609     24,609
ratio-regions          59,137     45,025     45,025
ray                   100,360     83,176     83,176
raytrace              273,557    233,973    233,941
simple                206,505    180,425    180,457
smith-normal-form     148,564    138,676    138,644
tailfib                23,841     23,553     23,553
tak                    24,257     23,969     23,969
tensor                 67,123     56,755     56,435
tsp                    45,746     38,738     38,738
tyan                  103,570     84,786     84,786
vector-concat          25,217     24,513     24,481
vector-rev             24,993     24,481     24,481
vliw                  349,633    290,977    290,881
wc-input1              55,754     47,306     47,242
wc-scanStream          56,842     48,266     48,202
zebra                 137,778    113,842    113,842
zern                   32,720     30,000     29,968
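
The size effect of going from -native-optimize 0 to 1 can be summarized as a
percentage reduction. The sketch below does that for three rows of the size
table; the helper name size_reduction_percent is hypothetical, and the
choice of rows is only illustrative.

```python
def size_reduction_percent(unoptimized, optimized):
    """Percentage by which `optimized` is smaller than `unoptimized`."""
    return 100.0 * (unoptimized - optimized) / unoptimized


# (MLton0, MLton1) sizes copied from the size table.
sizes = {
    "fib": (24161, 23873),
    "hamlet": (1399976, 1102568),
    "ratio-regions": (59137, 45025),
}

reductions = {
    name: size_reduction_percent(s0, s1) for name, (s0, s1) in sizes.items()
}

for name, pct in reductions.items():
    print(f"{name}: {pct:.1f}% smaller")
```

On these rows the small benchmark (fib) shrinks by about one percent, while
the larger ones shrink by roughly a fifth or more.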