limit checks

Stephen Weeks MLton@sourcelight.com
Wed, 24 Oct 2001 22:26:08 -0700


I checked in a minor, but important, tweak to limit check insertion so
that with -limit-check-per-block true, a block is only a candidate for
a limit check if it does some allocation.  Thus the register allocator
won't move everything into stack slots.  This bought back most of the
runtime performance that we lost with -limit-check-per-block true, but
code size is still hurting, especially for a self compile.
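
To make the candidate rule concrete, here is a minimal sketch in Python.
It is not MLton's code: the Block record, its bytes_allocated field, and
the function name are hypothetical, invented only to illustrate the rule
that a block is a candidate for a limit check only if it allocates.

```python
from dataclasses import dataclass


@dataclass
class Block:
    label: str            # hypothetical block label
    bytes_allocated: int  # hypothetical: bytes this block allocates


def limit_check_candidates(blocks):
    """Return the labels of blocks that do some allocation.

    Blocks that allocate nothing are skipped, so they get no limit check.
    """
    return [b.label for b in blocks if b.bytes_allocated > 0]


blocks = [Block("L_entry", 0), Block("L_loop", 16), Block("L_exit", 0)]
print(limit_check_candidates(blocks))
```

Here only L_loop allocates, so it is the only candidate.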

Here are the benchmark numbers.

MLton0 -- mlton -limit-check-per-block false
MLton1 -- mlton -limit-check-per-block true

compile time
benchmark         MLton0 MLton1
barnes-hut           2.6    2.6
checksum             0.7    0.7
count-graphs         1.9    1.9
DLXSimulator         4.3    4.4
fft                  1.3    1.3
fib                  0.6    0.6
hamlet              51.0   55.1
knuth-bendix         2.5    2.5
lexgen               5.6    5.8
life                 1.4    1.5
logic                6.5    8.7
mandelbrot           0.6    0.6
matrix-multiply      0.7    0.7
md5                  1.4    1.4
merge                0.7    0.7
mlyacc              23.3   25.1
mpuz                 0.9    1.0
nucleic              3.1    3.1
peek                 1.1    1.1
psdes-random         0.7    0.7
ratio-regions        2.8    2.8
ray                  3.5    3.6
raytrace             9.6    9.8
simple               6.9    7.4
smith-normal-form    7.9    8.0
tailfib              0.6    0.6
tak                  0.6    0.6
tensor               3.1    3.2
tsp                  1.6    1.6
tyan                 4.0    4.2
vector-concat        0.7    0.7
vector-rev           0.7    0.7
vliw                12.9   13.6
wc-input1            1.7    1.8
wc-scanStream        1.8    1.9
zebra                9.7   10.9
zern                 1.1    1.1

run time
benchmark         MLton0 MLton1
barnes-hut           5.0    5.3
checksum             3.7    3.7
count-graphs         5.7    5.7
DLXSimulator        15.1   14.5
fft                  7.4    7.4
fib                  4.6    4.6
hamlet               9.0    9.2
knuth-bendix         8.5    8.5
lexgen              12.2   12.8
life                11.0   10.6
logic               26.6   27.2
mandelbrot           9.4    9.4
matrix-multiply      6.7    6.8
md5                  0.6    0.6
merge               39.2   39.7
mlyacc              10.5   10.7
mpuz                 6.3    6.4
nucleic              8.4    9.1
peek                 4.8    4.8
psdes-random         4.6    4.6
ratio-regions        9.1    9.2
ray                  4.9    5.1
raytrace             5.8    6.0
simple               6.7    6.9
smith-normal-form    1.1    1.1
tailfib             22.1   22.1
tak                 10.5   10.5
tensor               9.7    9.5
tsp                 12.0   11.7
tyan                19.8   20.1
vector-concat        8.0    7.9
vector-rev           3.1    3.2
vliw                   *    6.8
wc-input1            2.8    2.6
wc-scanStream        4.3    3.9
zebra                2.7    2.8
zern                38.6   38.1

run time ratio
benchmark         MLton1
barnes-hut           1.0
checksum             1.0
count-graphs         1.0
DLXSimulator         1.0
fft                  1.0
fib                  1.0
hamlet               1.0
knuth-bendix         1.0
lexgen               1.0
life                 1.0
logic                1.0
mandelbrot           1.0
matrix-multiply      1.0
md5                  1.0
merge                1.0
mlyacc               1.0
mpuz                 1.0
nucleic              1.1
peek                 1.0
psdes-random         1.0
ratio-regions        1.0
ray                  1.0
raytrace             1.0
simple               1.0
smith-normal-form    1.0
tailfib              1.0
tak                  1.0
tensor               1.0
tsp                  1.0
tyan                 1.0
vector-concat        1.0
vector-rev           1.0
vliw                ~1.0
wc-input1            0.9
wc-scanStream        0.9
zebra                1.0
zern                 1.0

size
benchmark            MLton0    MLton1
barnes-hut           62,201    64,073
checksum             21,117    21,117
count-graphs         42,853    44,893
DLXSimulator         85,237    88,821
fft                  30,337    30,713
fib                  21,109    21,109
hamlet            1,058,576 1,285,896
knuth-bendix         64,390    66,422
lexgen              134,493   146,445
life                 40,357    42,869
logic               159,405   253,741
mandelbrot           21,077    21,077
matrix-multiply      21,485    21,485
md5                  29,454    29,918
merge                22,253    22,309
mlyacc              449,421   526,637
mpuz                 27,381    27,805
nucleic              61,333    63,469
peek                 29,518    30,734
psdes-random         22,165    22,165
ratio-regions        44,133    45,141
ray                  72,144    78,416
raytrace            174,029   197,597
simple              158,953   181,721
smith-normal-form   143,453   144,469
tailfib              20,781    20,781
tak                  21,133    21,133
tensor               65,244    67,020
tsp                  35,670    35,542
tyan                 83,966    89,486
vector-concat        21,789    21,789
vector-rev           21,621    21,613
vliw                290,313   316,433
wc-input1            41,470    43,830
wc-scanStream        44,094    46,542
zebra               122,222   129,526
zern                 27,304    27,344
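
To see where the code size hurts most, the size table can be turned into
growth ratios (MLton1 / MLton0).  The sketch below does that for a few
rows copied from the table above; the variable and function names are my
own, and the choice of rows is arbitrary.

```python
# (MLton0 size, MLton1 size), copied from the size table above.
sizes = {
    "fib": (21109, 21109),
    "hamlet": (1058576, 1285896),
    "logic": (159405, 253741),
    "mlyacc": (449421, 526637),
}


def growth(before, after):
    """Size with -limit-check-per-block true relative to false."""
    return after / before


ratios = {name: growth(m0, m1) for name, (m0, m1) in sizes.items()}

# Print the benchmarks from largest to smallest growth.
for name, ratio in sorted(ratios.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<10} {ratio:.2f}")
```

On these rows, logic grows the most (a bit under 60%), hamlet and mlyacc
grow by roughly 20%, and fib is unchanged.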