unused-args
Matthew Fluet
mfluet@intertrust.com
Sat, 21 Jul 2001 18:39:34 -0700 (PDT)
Here are the benchmark results for the unused args optimization.

MLton is the new codegen (tracking stack slots) with both the local
flatten and unused args optimizations.

old MLton is the new codegen (tracking stack slots) with the local flatten
optimization, but with the unused args optimization turned off.

stable MLton is essentially the release (the release compiler with a
slightly updated basis library).

So, the ratio with old MLton shows (surprisingly) that removing those
unused args can be a win. It didn't hurt anything too badly, but I'm
surprised that anything got slower at all. The ratio with stable MLton shows
that even with these two new optimizations, the live stack slot codegen
isn't quite up to snuff. And the compile times show that the new codegen
is still way too slow. And the sizes are bigger with the new codegen too.

run time ratio
benchmark          stable MLton  old MLton
barnes-hut                  1.1        1.0
checksum                    1.0        1.1
count-graphs                1.0        1.1
DLXSimulator                1.0        1.0
fft                         1.1        1.1
fib                         0.9        0.9
hamlet                      1.1        1.1
knuth-bendix                1.0        0.9
lexgen                      1.2        1.2
life                        0.8        1.0
logic                       1.2        1.1
mandelbrot                  1.1        1.0
matrix-multiply             1.2        1.0
md5                         0.9        1.0
merge                       1.3        1.0
mlyacc                      1.2        1.2
mpuz                        0.9        1.0
nucleic                     1.0        1.0
peek                        0.8        0.9
psdes-random                1.6        1.7
ratio-regions               0.9        1.0
ray                         0.9        1.0
raytrace                    1.1        1.0
simple                      0.9        1.0
smith-normal-form           1.0        1.0
tak                         0.9        0.9
tensor                      1.1        1.0
tsp                         1.0        1.0
vector-concat               2.0        2.1
vector-rev                  0.9        1.0
vliw                        1.2        1.1
wc-input1                   0.7        1.0
wc-scanStream               1.4        1.5
zebra                       0.8        1.0
zern                        1.1        1.0

run time
benchmark          MLton  stable MLton  old MLton
barnes-hut           5.0           5.4        5.2
checksum             4.5           4.4        4.8
count-graphs         7.2           7.2        8.3
DLXSimulator        13.0          13.6       13.4
fft                  9.0          10.2        9.6
fib                  4.6           4.4        4.1
hamlet               9.8          10.8       11.0
knuth-bendix         8.5           8.4        7.9
lexgen              14.5          16.7       17.1
life                13.5          11.4       13.7
logic               26.3          31.0       29.3
mandelbrot           8.2           9.0        8.2
matrix-multiply      5.2           6.2        5.1
md5                  5.6           5.2        5.8
merge               36.6          47.7       38.3
mlyacc              10.7          13.3       13.1
mpuz                 7.4           6.9        7.6
nucleic              8.5           8.5        8.6
peek                 6.6           5.2        6.0
psdes-random         5.8           9.3        9.9
ratio-regions       10.3           9.5       10.2
ray                  6.6           5.9        6.3
raytrace             5.9           6.3        6.0
simple               7.9           7.2        7.9
smith-normal-form    1.1           1.1        1.1
tak                 12.2          10.7       11.5
tensor               8.1           8.7        7.7
tsp                 13.0          12.6       12.9
vector-concat        9.4          19.1       19.5
vector-rev           4.1           3.7        4.1
vliw                 6.9           8.2        7.8
wc-input1            3.9           2.7        4.1
wc-scanStream        6.1           8.6        9.4
zebra                4.0           3.2        4.0
zern                40.9          44.5       41.0
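
The ratio columns at the top of the results appear to be each compiler's run time divided by MLton's run time, so a value above 1.0 means MLton is faster. The post does not state this, so treat it as an assumption. A minimal Python sketch (the helper name `ratios` is illustrative) that checks it against a few rows copied from the run time table:

```python
# Run times copied from the run time table: (MLton, stable MLton, old MLton).
run_time = {
    "barnes-hut": (5.0, 5.4, 5.2),
    "life": (13.5, 11.4, 13.7),
    "psdes-random": (5.8, 9.3, 9.9),
    "vector-concat": (9.4, 19.1, 19.5),
}


def ratios(mlton, stable, old):
    """Return (stable / MLton, old / MLton), rounded to one decimal place.

    Assumption: this is how the ratio table was derived.
    """
    return (round(stable / mlton, 1), round(old / mlton, 1))


for name, times in run_time.items():
    stable_ratio, old_ratio = ratios(*times)
    print(f"{name:15s} {stable_ratio:.1f} {old_ratio:.1f}")
```

Because the printed run times are themselves rounded, a ratio recomputed this way can differ from the table in the last digit for some benchmarks.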

compile time
benchmark          MLton  stable MLton  old MLton
barnes-hut           3.2           2.8        3.2
checksum             0.8           0.8        0.8
count-graphs         2.6           1.9        2.5
DLXSimulator         7.7           4.6        7.8
fft                  2.1           1.5        1.9
fib                  0.8           0.7        0.8
hamlet              59.3          51.7       58.3
knuth-bendix         2.9           2.4        3.0
lexgen               8.6           5.5        8.5
life                 1.6           1.4        1.7
logic                8.5           7.4        8.5
mandelbrot           0.8           0.8        0.8
matrix-multiply      0.8           0.8        0.9
md5                  3.4           2.5        3.5
merge                0.8           0.8        0.8
mlyacc              53.8          19.4       53.1
mpuz                 1.1           1.0        1.1
nucleic              3.8           4.4        3.8
peek                 1.3           1.2        1.3
psdes-random         0.9           0.8        0.9
ratio-regions        4.8           3.2        4.8
ray                  5.4           3.6        5.4
raytrace            13.2           9.9       12.9
simple              12.2           7.4       12.6
smith-normal-form    8.4           8.0        8.6
tak                  0.8           0.8        0.8
tensor               4.0           3.1        4.1
tsp                  1.9           1.9        2.0
vector-concat        0.9           0.8        0.9
vector-rev           0.8           0.8        0.8
vliw                17.9          12.4       17.3
wc-input1            2.1           1.6        2.0
wc-scanStream        2.3           1.7        2.3
zebra               14.1           4.9       14.2
zern                 1.3           1.1        1.3
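
To put a number on the compile-time gap, here is a small sketch that divides the new codegen's compile time by stable MLton's and ranks the results. It uses only a handful of rows copied from the compile time table, and the name `slowdown` is illustrative.

```python
# Compile times copied from the compile time table: (MLton, stable MLton).
compile_time = {
    "DLXSimulator": (7.7, 4.6),
    "hamlet": (59.3, 51.7),
    "mlyacc": (53.8, 19.4),
    "nucleic": (3.8, 4.4),
    "zebra": (14.1, 4.9),
}


def slowdown(new, stable):
    """Compile-time ratio of the new codegen to stable; above 1.0 is slower."""
    return new / stable


# Benchmark names ordered from largest slowdown to smallest.
ranked = sorted(
    compile_time,
    key=lambda name: slowdown(*compile_time[name]),
    reverse=True,
)

for name in ranked:
    print(f"{name:13s} {slowdown(*compile_time[name]):.2f}x")
```

Of these rows, zebra and mlyacc are the worst at nearly 3x, while nucleic actually compiles faster with the new codegen.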

size
benchmark              MLton  stable MLton  old MLton
barnes-hut            67,456        64,618     68,976
checksum              23,804        23,666     24,556
count-graphs          48,164        44,346     49,876
DLXSimulator         112,380        93,290    112,180
fft                   35,440        33,330     36,248
fib                   23,572        23,482     24,340
hamlet             1,081,663       974,905  1,069,007
knuth-bendix          67,717        60,579     68,525
lexgen               147,884       130,994    149,460
life                  39,500        37,274     39,780
logic                153,580       151,378    156,332
mandelbrot            23,628        23,554     24,372
matrix-multiply       24,188        24,018     24,932
md5                   39,373        37,827     41,645
merge                 24,852        24,618     25,620
mlyacc               503,980       423,186    497,532
mpuz                  29,252        28,946     30,340
nucleic               62,980        60,698     63,724
peek                  31,541        30,587     32,541
psdes-random          24,652        24,802     25,812
ratio-regions         66,580        60,274     67,316
ray                   86,063        72,641     87,199
raytrace             203,764       179,262    202,764
simple               207,576       171,178    208,840
smith-normal-form    148,292       142,874    150,804
tak                   23,700        23,514     24,444
tensor                71,819        68,617     73,155
tsp                   40,845        39,707     41,845
vector-concat         24,372        24,442     25,508
vector-rev            24,300        24,106     25,060
vliw                 317,744       262,938    307,072
wc-input1             44,613        40,563     44,869
wc-scanStream         48,061        43,107     48,237
zebra                153,677       111,779    154,397
zern                  28,487        27,337     28,847