SSA simplify passes
Matthew Fluet
fluet@CS.Cornell.EDU
Mon, 7 Jan 2002 19:58:31 -0500 (EST)
Here are the rest of the benchmarks:
MLton0 -- mlton -new-flatten false -tuple-recon-elim 0
MLton1 -- mlton -new-flatten false -tuple-recon-elim 1
MLton2 -- mlton -new-flatten false -tuple-recon-elim 2
MLton3 -- mlton -new-flatten true -tuple-recon-elim 0
MLton4 -- mlton -new-flatten true -tuple-recon-elim 1
MLton5 -- mlton -new-flatten true -tuple-recon-elim 2
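As a reading aid, the six labels are the cross product of the two flag settings. The sketch below only rebuilds the label-to-command-line mapping listed above; it does not invoke the compiler, and the helper name `configs` is made up for illustration.

```python
from itertools import product

def configs():
    """Yield (label, command line) pairs in the same order as the table columns."""
    # Outer loop: -new-flatten false/true; inner loop: -tuple-recon-elim 0/1/2.
    combos = product(["false", "true"], [0, 1, 2])
    for n, (flatten, level) in enumerate(combos):
        yield f"MLton{n}", f"mlton -new-flatten {flatten} -tuple-recon-elim {level}"

for label, cmd in configs():
    print(label, "--", cmd)
```

So MLton0 through MLton2 use the old flattener and MLton3 through MLton5 use the new one, each at elimination levels 0, 1 and 2.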
compile time
benchmark MLton0 MLton1 MLton2 MLton3 MLton4 MLton5
barnes-hut 2.56 2.61 2.60 2.55 2.57 2.59
checksum 0.59 0.60 0.59 0.57 0.58 0.60
count-graphs 1.72 1.74 1.75 1.78 1.73 1.75
DLXSimulator 4.54 4.57 4.51 4.63 4.59 4.60
fft 1.30 1.29 1.29 1.28 1.30 1.30
fib 0.55 0.54 0.57 0.54 0.54 0.56
hamlet 56.31 55.36 55.67 53.72 53.70 53.74
imp-for 0.59 0.60 0.58 0.60 0.58 0.60
knuth-bendix 2.22 2.26 2.25 2.25 2.24 2.24
lexgen 5.72 5.72 5.73 5.75 5.74 5.77
life 1.34 1.32 1.34 1.32 1.32 1.32
logic 3.10 3.11 3.13 3.06 3.08 3.09
mandelbrot 0.58 0.61 0.61 0.59 0.57 0.60
matrix-multiply 0.68 0.67 0.69 0.69 0.69 0.65
md5 1.21 1.26 1.26 1.24 1.24 1.26
merge 0.59 0.60 0.63 0.62 0.64 0.60
mlyacc 22.70 23.29 22.48 24.40 22.29 22.31
mpuz 0.79 0.80 0.80 0.80 0.84 0.83
nucleic 2.75 2.76 2.80 2.76 2.74 2.71
peek 0.99 1.00 0.99 0.98 1.01 1.02
psdes-random 0.62 0.63 0.61 0.63 0.62 0.63
ratio-regions 2.55 2.51 2.53 2.57 2.56 2.55
ray 3.60 3.60 3.51 3.56 3.57 3.63
raytrace 9.09 9.07 9.07 10.55 10.54 10.51
simple 7.34 7.29 7.31 7.40 7.39 7.38
smith-normal-form 8.15 8.15 8.17 8.04 8.02 8.04
tailfib 0.55 0.57 0.55 0.55 0.56 0.53
tak 0.55 0.56 0.55 0.58 0.58 0.56
tensor 3.10 3.11 3.08 3.08 3.10 3.08
tsp 1.54 1.52 1.51 1.56 1.53 1.54
tyan 3.82 3.79 3.78 3.87 3.73 3.73
vector-concat 0.62 0.63 0.63 0.65 0.61 0.62
vector-rev 0.57 0.61 0.65 0.58 0.58 0.58
vliw 12.77 12.80 12.80 12.83 12.86 12.61
wc-input1 1.63 1.63 1.66 1.62 1.62 1.62
wc-scanStream 1.66 1.66 1.67 1.67 1.67 1.69
zebra 5.96 5.96 5.96 5.49 5.52 5.96
zern 1.07 1.06 1.06 1.09 1.08 1.09
run time
benchmark MLton0 MLton1 MLton2 MLton3 MLton4 MLton5
barnes-hut 4.32 4.32 4.33 4.30 4.30 4.30
checksum 3.09 3.09 3.09 3.09 3.09 3.09
count-graphs 4.95 5.07 4.92 4.95 5.03 4.86
DLXSimulator 15.75 15.73 15.71 15.53 15.53 15.55
fft 9.44 9.44 9.41 9.44 9.42 9.42
fib 3.41 3.41 3.41 3.41 3.41 3.41
hamlet 8.20 8.16 8.29 7.22 7.22 7.36
imp-for 8.23 8.23 8.23 8.23 8.23 8.23
knuth-bendix 6.69 6.69 6.53 6.48 6.48 6.48
lexgen 10.79 10.79 10.60 10.81 10.80 10.81
life 6.83 6.87 6.53 6.82 6.84 6.88
logic 20.74 20.74 20.78 18.00 18.03 18.00
mandelbrot 6.20 6.20 6.20 6.20 6.20 6.20
matrix-multiply 3.92 3.94 3.92 3.89 3.94 3.94
md5 2.03 2.03 2.03 2.03 2.03 2.03
merge 49.66 49.77 49.73 49.76 49.69 49.71
mlyacc 9.55 9.40 10.38 9.39 9.38 9.41
mpuz 4.57 4.57 4.57 4.54 4.54 4.54
nucleic 7.70 7.72 7.65 8.29 8.28 8.32
peek 3.26 3.25 3.25 3.25 3.25 3.25
psdes-random 3.36 3.36 3.36 3.36 3.36 3.36
ratio-regions 8.81 8.80 8.80 8.81 8.80 8.81
ray 3.84 3.84 3.84 3.74 3.74 3.73
raytrace 4.77 4.80 4.80 4.97 4.95 4.94
simple 6.01 6.03 6.64 6.07 6.04 6.07
smith-normal-form 0.94 0.95 0.94 0.95 0.94 0.94
tailfib 15.47 15.48 15.48 15.47 15.48 15.47
tak 8.77 8.77 8.77 8.87 8.87 8.87
tensor 5.82 5.82 5.82 5.82 5.82 5.82
tsp 8.76 8.76 8.76 8.77 8.76 8.77
tyan 19.93 17.45 13.30 19.86 17.32 17.34
vector-concat 5.87 5.99 5.86 5.83 5.76 5.77
vector-rev 4.10 4.10 4.13 4.12 4.10 4.11
vliw 6.32 6.30 6.36 6.20 6.17 6.18
wc-input1 1.74 1.73 1.73 1.74 1.74 1.74
wc-scanStream 3.48 3.48 3.48 3.47 3.47 3.47
zebra 2.33 2.36 2.35 2.37 2.35 2.35
zern 35.22 35.31 35.34 35.40 35.25 35.46
run time ratio (relative to MLton0)
benchmark MLton1 MLton2 MLton3 MLton4 MLton5
barnes-hut 1.00 1.00 0.99 0.99 1.00
checksum 1.00 1.00 1.00 1.00 1.00
count-graphs 1.02 0.99 1.00 1.02 0.98
DLXSimulator 1.00 1.00 0.99 0.99 0.99
fft 1.00 1.00 1.00 1.00 1.00
fib 1.00 1.00 1.00 1.00 1.00
hamlet 1.00 1.01 0.88 0.88 0.90
imp-for 1.00 1.00 1.00 1.00 1.00
knuth-bendix 1.00 0.98 0.97 0.97 0.97
lexgen 1.00 0.98 1.00 1.00 1.00
life 1.01 0.96 1.00 1.00 1.01
logic 1.00 1.00 0.87 0.87 0.87
mandelbrot 1.00 1.00 1.00 1.00 1.00
matrix-multiply 1.00 1.00 0.99 1.01 1.00
md5 1.00 1.00 1.00 1.00 1.00
merge 1.00 1.00 1.00 1.00 1.00
mlyacc 0.98 1.09 0.98 0.98 0.99
mpuz 1.00 1.00 0.99 0.99 0.99
nucleic 1.00 0.99 1.08 1.07 1.08
peek 1.00 1.00 1.00 1.00 1.00
psdes-random 1.00 1.00 1.00 1.00 1.00
ratio-regions 1.00 1.00 1.00 1.00 1.00
ray 1.00 1.00 0.97 0.97 0.97
raytrace 1.00 1.01 1.04 1.04 1.04
simple 1.00 1.10 1.01 1.00 1.01
smith-normal-form 1.00 1.00 1.00 1.00 1.00
tailfib 1.00 1.00 1.00 1.00 1.00
tak 1.00 1.00 1.01 1.01 1.01
tensor 1.00 1.00 1.00 1.00 1.00
tsp 1.00 1.00 1.00 1.00 1.00
tyan 0.88 0.67 1.00 0.87 0.87
vector-concat 1.02 1.00 0.99 0.98 0.98
vector-rev 1.00 1.01 1.00 1.00 1.00
vliw 1.00 1.01 0.98 0.98 0.98
wc-input1 1.00 1.00 1.00 1.00 1.00
wc-scanStream 1.00 1.00 1.00 1.00 1.00
zebra 1.02 1.01 1.02 1.01 1.01
zern 1.00 1.00 1.01 1.00 1.01
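The ratio table appears to be each run time divided by the MLton0 run time for the same benchmark. That is an assumption (the post does not state the formula), but it reproduces the rows checked here. A minimal sketch, using three rows copied from the run time table and a hypothetical helper `ratios`:

```python
# Run times copied from the "run time" table, columns MLton0..MLton5.
run_time = {
    "hamlet": [8.20, 8.16, 8.29, 7.22, 7.22, 7.36],
    "logic": [20.74, 20.74, 20.78, 18.00, 18.03, 18.00],
    "tyan": [19.93, 17.45, 13.30, 19.86, 17.32, 17.34],
}

def ratios(times):
    """Divide MLton1..MLton5 by the MLton0 baseline and round to two places."""
    base = times[0]
    return [round(t / base, 2) for t in times[1:]]

for name, times in run_time.items():
    print(name, " ".join(f"{r:.2f}" for r in ratios(times)))
```

A value below 1.00 means the configuration ran faster than MLton0. Recomputing from the rounded times can differ from the table in the last digit for some rows (nucleic under MLton4, for example), presumably because the original ratios were computed from unrounded timings.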
size
benchmark MLton0 MLton1 MLton2 MLton3 MLton4 MLton5
barnes-hut 69,200 69,200 69,216 69,312 69,312 69,328
checksum 21,000 21,000 21,000 21,000 21,000 21,000
count-graphs 44,256 44,256 44,040 44,064 44,064 44,064
DLXSimulator 99,296 99,296 99,264 97,904 97,904 97,904
fft 32,484 32,484 32,484 32,484 32,484 32,484
fib 21,000 21,000 21,000 21,000 21,000 21,000
hamlet 1,499,755 1,466,843 1,485,387 1,384,411 1,384,267 1,383,275
imp-for 20,992 20,992 20,992 20,992 20,992 20,992
knuth-bendix 65,529 65,529 65,505 65,185 65,185 65,185
lexgen 157,032 157,032 157,864 157,336 157,336 157,320
life 40,976 40,976 40,880 40,344 40,320 40,320
logic 88,824 88,824 88,824 88,104 88,104 88,104
mandelbrot 20,960 20,960 20,960 20,960 20,960 20,960
matrix-multiply 21,552 21,552 21,552 21,552 21,552 21,552
md5 31,481 31,481 31,481 31,481 31,481 31,481
merge 22,208 22,208 22,208 22,208 22,208 22,208
mlyacc 579,288 574,808 573,976 567,240 567,240 566,072
mpuz 25,976 25,976 25,976 25,840 25,840 25,840
nucleic 63,168 63,168 62,640 62,200 62,200 61,536
peek 31,025 31,025 31,025 29,953 29,953 29,953
psdes-random 21,968 21,968 21,968 21,968 21,968 21,968
ratio-regions 44,128 44,128 44,128 44,128 44,128 44,128
ray 85,259 85,227 85,179 84,459 84,459 84,507
raytrace 204,824 204,712 203,800 298,328 298,216 298,200
simple 194,996 194,244 197,660 198,108 197,996 197,980
smith-normal-form 148,732 148,732 148,764 147,628 147,628 147,660
tailfib 20,672 20,672 20,672 20,672 20,672 20,672
tak 21,120 21,120 21,120 21,120 21,120 21,120
tensor 71,523 71,523 71,523 70,147 70,147 70,163
tsp 37,065 37,065 37,081 37,065 37,065 37,081
tyan 91,897 91,177 89,353 89,369 88,649 88,649
vector-concat 21,808 21,808 21,808 21,808 21,808 21,808
vector-rev 21,624 21,624 21,624 21,624 21,624 21,624
vliw 340,628 340,532 337,124 336,772 336,644 332,468
wc-input1 45,121 45,121 45,137 44,929 44,929 44,961
wc-scanStream 46,417 46,417 46,417 46,273 46,273 46,257
zebra 127,641 127,641 127,641 127,641 127,641 127,641
zern 28,131 28,131 28,131 28,131 28,131 28,131
Not too bad. hamlet and logic suggest that the new flattener is
better (run time ratios of 0.87 to 0.90), independent of the tuple
reconstruction elimination. On the other hand, raytrace suggests the
opposite (run time ratio of 1.04, and size up from 204,824 to 298,328).
That probably makes sense, though -- raytrace uses floats everywhere,
which are expensive to move in and out of argument positions (even a
memory-memory move is still bounced through the floating point stack),
so in those cases it is probably better to pass a single pointer to a
tuple of floats.
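No illumination on what's going on with tyan: with the old flattener it
gets faster at each tuple-recon-elim level (19.93, 17.45, 13.30), but with
the new flattener levels 1 and 2 both come in at about 17.3.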
Overall, no major slowdowns and a couple of decent improvements. I'm
leaning towards cleaning it up (i.e., dropping those options and sticking
with the new flattener and tuple reconstruction elimination always on in
the shrinker) and checking it in. The flattener might benefit from a
little more tweaking.