common-subexpression elimination (cse)
Stephen Weeks
MLton@sourcelight.com
Tue, 24 Jul 2001 10:58:00 -0700
Yesterday, I implemented cse on CPS. In addition to getting the usual sorts of
things like
(i + 1) + (i + 1)
-->
let val i' = i + 1 in i' + i' end
it also gets things like
val a = Array_array n
val b = Array_length a
-->
val a = Array_array n
val b = n
Here are the benchmark results. In them, "MLton" is with cse, "stable MLton" is
without. In summary, no major improvements or problems, except on count-graphs,
which showed a bug (more on that in the next message). It pretty much
universally shrunk code size. Anyways, although it didn't help alot, I think it
is important in enabling other optimizations, like redundant test removal,
overflow detection elimination, and bounds check elimination.
run time ratio
benchmark stable MLton
barnes-hut 1.0
checksum 1.0
count-graphs ~1.0
DLXSimulator 1.0
fft 1.0
fib 1.0
hamlet 1.0
knuth-bendix 1.0
lexgen 1.0
life 1.1
logic 1.0
mandelbrot 0.9
matrix-multiply 1.0
md5 1.0
merge 1.0
mlyacc 1.0
mpuz 1.0
nucleic 1.0
peek 1.0
psdes-random 1.0
ratio-regions 1.0
ray 1.0
raytrace 1.0
simple 1.0
smith-normal-form 1.0
tak 1.0
tensor 0.8
tsp 1.0
vector-concat 1.2
vector-rev 1.1
vliw 1.0
wc-input1 1.0
wc-scanStream 1.0
zebra 1.0
zern 1.0
run time
benchmark MLton stable MLton
barnes-hut 5.2 5.2
checksum 4.4 4.6
count-graphs * 6.0
DLXSimulator 12.5 12.9
fft 9.8 9.9
fib 4.4 4.4
hamlet 9.5 9.8
knuth-bendix 8.6 8.4
lexgen 13.7 13.8
life 10.6 11.2
logic 26.8 26.5
mandelbrot 9.7 9.0
matrix-multiply 6.2 6.2
md5 4.9 5.2
merge 38.4 38.3
mlyacc 10.8 10.9
mpuz 7.0 6.9
nucleic 8.5 8.5
peek 4.7 4.9
psdes-random 5.7 5.8
ratio-regions 9.4 9.3
ray 5.9 6.1
raytrace 6.5 6.4
simple 7.0 7.3
smith-normal-form 1.1 1.1
tak 11.4 11.4
tensor 9.7 8.1
tsp 12.4 12.7
vector-concat 7.5 8.7
vector-rev 3.4 3.7
vliw 7.1 7.0
wc-input1 3.0 3.0
wc-scanStream 5.4 5.2
zebra 3.2 3.2
zern 45.2 45.6
compile time
benchmark MLton stable MLton
barnes-hut 2.7 2.7
checksum 0.8 0.8
count-graphs 1.9 1.9
DLXSimulator 4.5 4.4
fft 1.4 1.4
fib 0.7 0.7
hamlet 52.5 51.1
knuth-bendix 2.5 2.4
lexgen 5.8 5.5
life 1.4 1.4
logic 7.7 7.5
mandelbrot 0.8 0.8
matrix-multiply 0.8 0.8
md5 2.8 2.3
merge 0.8 0.8
mlyacc 20.0 19.4
mpuz 1.0 1.0
nucleic 4.2 4.2
peek 1.2 1.1
psdes-random 0.8 0.8
ratio-regions 3.1 3.1
ray 3.7 3.5
raytrace 10.5 10.1
simple 7.2 7.4
smith-normal-form 8.1 7.8
tak 0.7 0.7
tensor 3.0 2.9
tsp 1.9 1.8
vector-concat 0.8 0.8
vector-rev 0.8 0.8
vliw 12.9 12.5
wc-input1 1.7 1.6
wc-scanStream 1.8 1.7
zebra 5.1 4.9
zern 1.1 1.1
size
benchmark MLton stable MLton
barnes-hut 63,904 64,000
checksum 23,692 23,708
count-graphs 43,468 43,476
DLXSimulator 93,252 93,844
fft 32,144 32,272
fib 23,492 23,508
hamlet 994,647 995,127
knuth-bendix 62,253 62,349
lexgen 130,084 130,220
life 37,100 37,548
logic 149,708 149,564
mandelbrot 23,580 23,580
matrix-multiply 24,028 24,044
md5 36,733 36,765
merge 24,644 24,644
mlyacc 427,916 428,324
mpuz 28,724 28,756
nucleic 60,724 60,724
peek 30,437 30,469
psdes-random 24,524 24,540
ratio-regions 58,724 60,324
ray 72,991 73,063
raytrace 181,116 181,876
simple 156,648 170,784
smith-normal-form 141,028 141,620
tak 23,524 23,540
tensor 66,307 66,691
tsp 39,357 39,485
vector-concat 24,196 24,212
vector-rev 24,044 24,140
vliw 272,064 272,824
wc-input1 40,965 40,933
wc-scanStream 43,613 43,565
zebra 111,789 111,789
zern 27,591 27,575