intraprocedural flattening
Stephen Weeks
MLton@sourcelight.com
Fri, 20 Jul 2001 14:30:58 -0700
> I think I will combine the optimization with a forward analysis that only
> flattens a jump argument if only tuples flow into it (in addition to the
> original condition).
I finished the forward analysis. Now a variable of tuple type is flattened iff
it only flows to selects and there is some explicit construction of a tuple that
flows into it. Here are the new benchmark numbers. Now there are no slowdowns,
with the exception of tak, which I am willing to attribute to noise.
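
The flattening condition above can be read as two reachability passes over a flow graph. The following is only a minimal sketch under stated assumptions, not MLton's implementation: the graph representation, the function name `flattenable`, and the variable names are all hypothetical, and "flows" here stands for passing a value as a jump argument.

```python
from collections import defaultdict

def flattenable(variables, constructed, flows, other_uses):
    """Return the tuple-typed variables that satisfy both conditions.

    Hypothetical inputs (not MLton's IR):
      constructed: variables bound directly to an explicit tuple construction
      flows:       (src, dst) pairs, meaning src is passed as jump argument dst
      other_uses:  variables with at least one use that is not a select
    """
    succs, preds = defaultdict(set), defaultdict(set)
    for src, dst in flows:
        succs[src].add(dst)
        preds[dst].add(src)

    # Forward pass: which variables can some explicit tuple reach?
    has_tuple = set(constructed)
    work = list(has_tuple)
    while work:
        v = work.pop()
        for d in succs[v]:
            if d not in has_tuple:
                has_tuple.add(d)
                work.append(d)

    # Backward pass: which variables can reach a non-select use?
    escapes = set(other_uses)
    work = list(escapes)
    while work:
        v = work.pop()
        for s in preds[v]:
            if s not in escapes:
                escapes.add(s)
                work.append(s)

    return {v for v in variables if v in has_tuple and v not in escapes}

# t and u are explicit tuples; p is a tuple of unknown origin.
result = flattenable(
    variables=["t", "x", "p", "y", "u", "z"],
    constructed={"t", "u"},
    flows=[("t", "x"), ("p", "x"), ("p", "y"), ("u", "z")],
    other_uses={"z"},  # z is used by something other than a select
)
print(sorted(result))
```

In this sketch `x` qualifies even though the unknown tuple `p` also flows into it, because some explicit construction (`t`) reaches it; `y` does not, because no explicit construction reaches it; `z` and `u` do not, because `z` has a non-select use.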

run time ratio (old MLton time / MLton time)
benchmark          old MLton
barnes-hut               1.0
checksum                 1.0
count-graphs             1.2
DLXSimulator             1.0
fft                      1.1
fib                      1.0
hamlet                   1.1
knuth-bendix             1.0
lexgen                   1.2
life                     1.0
logic                    1.2
mandelbrot               1.0
matrix-multiply          1.0
md5                      1.0
merge                    1.0
mlyacc                   1.2
mpuz                     1.0
nucleic                  1.0
peek                     1.1
psdes-random             1.7
ratio-regions            1.0
ray                      1.0
raytrace                 1.0
simple                   1.0
smith-normal-form        1.0
tak                      0.9
tensor                   1.1
tsp                      1.0
vector-concat            2.1
vector-rev               1.0
vliw                     1.1
wc-input1                1.0
wc-scanStream            1.6
zebra                    1.0
zern                     1.0

compile time
benchmark            MLton  old MLton
barnes-hut             2.7        2.7
checksum               0.8        0.8
count-graphs           1.8        1.9
DLXSimulator           4.5        4.4
fft                    1.4        1.4
fib                    0.7        0.7
hamlet                51.6       50.6
knuth-bendix           2.5        2.4
lexgen                 5.7        5.5
life                   1.4        1.4
logic                  7.5        7.5
mandelbrot             0.8        0.8
matrix-multiply        0.8        0.8
md5                    2.4        2.4
merge                  0.8        0.8
mlyacc                19.6       19.5
mpuz                   1.0        1.0
nucleic                4.3        4.3
peek                   1.1        1.2
psdes-random           0.8        0.8
ratio-regions          3.1        3.1
ray                    3.5        3.5
raytrace               9.9       10.0
simple                 7.5        7.1
smith-normal-form      7.8        7.8
tak                    0.7        0.7
tensor                 3.0        3.0
tsp                    1.9        1.8
vector-concat          0.8        0.8
vector-rev             0.8        0.8
vliw                  12.8       12.2
wc-input1              1.6        1.6
wc-scanStream          1.7        1.7
zebra                  5.1        4.8
zern                   1.1        1.1

run time
benchmark            MLton  old MLton
barnes-hut             5.2        5.4
checksum               4.6        4.4
count-graphs           5.9        7.2
DLXSimulator          12.9       13.6
fft                    9.5       10.3
fib                    4.4        4.4
hamlet                 9.8       10.8
knuth-bendix           8.3        8.2
lexgen                13.8       16.5
life                  11.1       11.4
logic                 26.5       31.0
mandelbrot             8.9        9.0
matrix-multiply        6.2        6.3
md5                    5.0        5.1
merge                 37.1       38.4
mlyacc                10.9       13.0
mpuz                   6.9        7.0
nucleic                8.5        8.5
peek                   4.9        5.3
psdes-random           5.8        9.6
ratio-regions          9.4        9.5
ray                    6.1        6.0
raytrace               6.4        6.5
simple                 7.2        7.2
smith-normal-form      1.1        1.1
tak                   11.4       10.7
tensor                 8.1        8.8
tsp                   12.5       12.8
vector-concat          8.6       17.7
vector-rev             3.7        3.6
vliw                   7.1        8.1
wc-input1              3.0        3.0
wc-scanStream          5.2        8.6
zebra                  3.3        3.2
zern                  44.6       44.7
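
For reference, the ratio column near the top of this message appears to be the old MLton run time divided by the new MLton run time, rounded to one decimal. Here is a small sketch of that arithmetic for a few rows, using times copied from the run time table above; the names `run_times` and `ratio` are made up for illustration. Because the posted times are themselves rounded, a recomputed ratio can differ from the posted one in the last digit (wc-scanStream gives 8.6 / 5.2, which rounds to 1.7 rather than the posted 1.6), presumably because the posted ratios came from unrounded times.

```python
# (MLton, old MLton) run times copied from the run time table
run_times = {
    "count-graphs": (5.9, 7.2),
    "psdes-random": (5.8, 9.6),
    "tak": (11.4, 10.7),
    "vector-concat": (8.6, 17.7),
}

def ratio(new, old):
    """Old time over new time; values above 1.0 mean the new MLton is faster."""
    return round(old / new, 1)

ratios = {name: ratio(new, old) for name, (new, old) in run_times.items()}

for name, r in ratios.items():
    print(f"{name:<15}{r:>5.1f}")
```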

size
benchmark            MLton  old MLton
barnes-hut          63,936     65,024
checksum            23,596     24,108
count-graphs        43,380     44,772
DLXSimulator        93,996     93,780
fft                 32,160     33,720
fib                 23,396     23,924
hamlet             995,775    973,871
knuth-bendix        62,277     62,845
lexgen             130,020    131,428
life                37,452     37,732
logic              149,476    151,820
mandelbrot          23,468     23,988
matrix-multiply     23,932     24,452
md5                 36,757     38,269
merge               24,532     25,060
mlyacc             428,228    423,620
mpuz                28,644     29,364
nucleic             60,612     61,132
peek                30,357     31,005
psdes-random        24,428     25,236
ratio-regions       60,212     60,676
ray                 72,951     73,751
raytrace           181,876    181,020
simple             170,704    171,616
smith-normal-form  141,540    143,300
tak                 23,428     23,948
tensor              68,051     69,091
tsp                 39,437     40,149
vector-concat       24,100     24,884
vector-rev          24,028     24,548
vliw               272,624    263,328
wc-input1           40,821     40,981
wc-scanStream       43,453     43,549
zebra              111,717    112,221
zern                27,479     27,759