limit checks
Stephen Weeks
MLton@sourcelight.com
Wed, 24 Oct 2001 22:26:08 -0700
I checked in a minor, but important, tweak to limit check insertion so
that with -limit-check-per-block true, a block is only a candidate for
a limit check if it does some allocation. Thus the register allocator
won't move everything into stack slots. This bought back most of the
runtime performance that we lost with -limit-check-per-block true, but
code size is still hurting, especially for a self compile.
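
To make the candidate rule concrete, here is a rough sketch in Python
(not MLton's actual SML code). The Block type, its bytes_allocated
field, and the labels are all hypothetical; the only thing taken from
the change itself is the rule that a block that allocates nothing is
never a candidate for a limit check.

```python
from dataclasses import dataclass


@dataclass
class Block:
    label: str            # hypothetical block name
    bytes_allocated: int  # hypothetical: bytes this block allocates


def limit_check_candidates(blocks):
    """Return labels of blocks that allocate; only these may get a check."""
    return [b.label for b in blocks if b.bytes_allocated > 0]


# Only the middle block allocates, so it is the only candidate.
blocks = [Block("L_entry", 0), Block("L_cons", 12), Block("L_loop", 0)]
print(limit_check_candidates(blocks))
```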
Here are the benchmark numbers.
MLton0 -- mlton -limit-check-per-block false
MLton1 -- mlton -limit-check-per-block true
compile time
benchmark           MLton0  MLton1
barnes-hut             2.6     2.6
checksum               0.7     0.7
count-graphs           1.9     1.9
DLXSimulator           4.3     4.4
fft                    1.3     1.3
fib                    0.6     0.6
hamlet                51.0    55.1
knuth-bendix           2.5     2.5
lexgen                 5.6     5.8
life                   1.4     1.5
logic                  6.5     8.7
mandelbrot             0.6     0.6
matrix-multiply        0.7     0.7
md5                    1.4     1.4
merge                  0.7     0.7
mlyacc                23.3    25.1
mpuz                   0.9     1.0
nucleic                3.1     3.1
peek                   1.1     1.1
psdes-random           0.7     0.7
ratio-regions          2.8     2.8
ray                    3.5     3.6
raytrace               9.6     9.8
simple                 6.9     7.4
smith-normal-form      7.9     8.0
tailfib                0.6     0.6
tak                    0.6     0.6
tensor                 3.1     3.2
tsp                    1.6     1.6
tyan                   4.0     4.2
vector-concat          0.7     0.7
vector-rev             0.7     0.7
vliw                  12.9    13.6
wc-input1              1.7     1.8
wc-scanStream          1.8     1.9
zebra                  9.7    10.9
zern                   1.1     1.1
run time
benchmark           MLton0  MLton1
barnes-hut             5.0     5.3
checksum               3.7     3.7
count-graphs           5.7     5.7
DLXSimulator          15.1    14.5
fft                    7.4     7.4
fib                    4.6     4.6
hamlet                 9.0     9.2
knuth-bendix           8.5     8.5
lexgen                12.2    12.8
life                  11.0    10.6
logic                 26.6    27.2
mandelbrot             9.4     9.4
matrix-multiply        6.7     6.8
md5                    0.6     0.6
merge                 39.2    39.7
mlyacc                10.5    10.7
mpuz                   6.3     6.4
nucleic                8.4     9.1
peek                   4.8     4.8
psdes-random           4.6     4.6
ratio-regions          9.1     9.2
ray                    4.9     5.1
raytrace               5.8     6.0
simple                 6.7     6.9
smith-normal-form      1.1     1.1
tailfib               22.1    22.1
tak                   10.5    10.5
tensor                 9.7     9.5
tsp                   12.0    11.7
tyan                  19.8    20.1
vector-concat          8.0     7.9
vector-rev             3.1     3.2
vliw                     *     6.8
wc-input1              2.8     2.6
wc-scanStream          4.3     3.9
zebra                  2.7     2.8
zern                  38.6    38.1
run time ratio
benchmark           MLton1
barnes-hut             1.0
checksum               1.0
count-graphs           1.0
DLXSimulator           1.0
fft                    1.0
fib                    1.0
hamlet                 1.0
knuth-bendix           1.0
lexgen                 1.0
life                   1.0
logic                  1.0
mandelbrot             1.0
matrix-multiply        1.0
md5                    1.0
merge                  1.0
mlyacc                 1.0
mpuz                   1.0
nucleic                1.1
peek                   1.0
psdes-random           1.0
ratio-regions          1.0
ray                    1.0
raytrace               1.0
simple                 1.0
smith-normal-form      1.0
tailfib                1.0
tak                    1.0
tensor                 1.0
tsp                    1.0
tyan                   1.0
vector-concat          1.0
vector-rev             1.0
vliw                  ~1.0
wc-input1              0.9
wc-scanStream          0.9
zebra                  1.0
zern                   1.0
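
For reading the ratio column: it appears to be the MLton1 run time
divided by the MLton0 run time, rounded to one decimal. The sketch
below recomputes it for three rows using the run times listed above.
The dictionary and function names are mine, and because the listed
times are themselves already rounded, a recomputed ratio can differ
from the table in the last digit for some benchmarks.

```python
# (MLton0, MLton1) run times copied from the run time table.
run_time = {
    "nucleic": (8.4, 9.1),
    "wc-input1": (2.8, 2.6),
    "zern": (38.6, 38.1),
}


def ratio(name):
    """MLton1 run time relative to MLton0, to one decimal place."""
    base, new = run_time[name]
    return round(new / base, 1)


for name in run_time:
    print(name, ratio(name))
```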
size
benchmark              MLton0     MLton1
barnes-hut             62,201     64,073
checksum               21,117     21,117
count-graphs           42,853     44,893
DLXSimulator           85,237     88,821
fft                    30,337     30,713
fib                    21,109     21,109
hamlet              1,058,576  1,285,896
knuth-bendix           64,390     66,422
lexgen                134,493    146,445
life                   40,357     42,869
logic                 159,405    253,741
mandelbrot             21,077     21,077
matrix-multiply        21,485     21,485
md5                    29,454     29,918
merge                  22,253     22,309
mlyacc                449,421    526,637
mpuz                   27,381     27,805
nucleic                61,333     63,469
peek                   29,518     30,734
psdes-random           22,165     22,165
ratio-regions          44,133     45,141
ray                    72,144     78,416
raytrace              174,029    197,597
simple                158,953    181,721
smith-normal-form     143,453    144,469
tailfib                20,781     20,781
tak                    21,133     21,133
tensor                 65,244     67,020
tsp                    35,670     35,542
tyan                   83,966     89,486
vector-concat          21,789     21,789
vector-rev             21,621     21,613
vliw                  290,313    316,433
wc-input1              41,470     43,830
wc-scanStream          44,094     46,542
zebra                 122,222    129,526
zern                   27,304     27,344
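
To put the code size cost in relative terms, here is a small sketch
that computes the percentage growth from MLton0 to MLton1 for a few of
the larger benchmarks, using the sizes listed above. The names in the
code are mine; only the numbers come from the table.

```python
# (MLton0, MLton1) sizes copied from the size table.
size = {
    "hamlet": (1_058_576, 1_285_896),
    "logic": (159_405, 253_741),
    "mlyacc": (449_421, 526_637),
    "tsp": (35_670, 35_542),
}


def growth_percent(name):
    """Size change from MLton0 to MLton1, as a percentage of MLton0."""
    base, new = size[name]
    return 100.0 * (new - base) / base


for name in size:
    print(f"{name}: {growth_percent(name):+.1f}%")
```

On these rows logic grows by roughly 59%, hamlet by roughly 21%, and
mlyacc by roughly 17%, while tsp shrinks very slightly.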