[MLton-devel] cvs commit: types for Rssa
Stephen Weeks
MLton@mlton.org
Sun, 8 Dec 2002 17:18:37 -0800
> I was originally going to ask what was wrong with doing the cast as part
> of the case statement, in that the argument of the destination label would
> rebind the test variable at the new type. But, I see now that the
> straightforward translation into Machine would require a move from the
> test variable to the argument, which is really redundant.
Right. So what actually happens is there is a move (with a cast) in
the nullary label that is the destination of the case branch. I had
initially tried leaving that move as an implicit part of the switch
(something like arith) even in MACHINE but that complicated the
codegens. So I figured it was easier to make the move explicit and to
someday fix the type checker to notice (via dominators) that the cast
is ok.
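
As an illustration of the dominator idea, here is a minimal sketch, not MLton code. It assumes a control-flow graph given as a dictionary from block label to successor labels, and treats a cast as justified when the block containing it is dominated by the case-branch target that established the type. The names `dominators`, `cast_is_justified`, and the labels in the example are hypothetical.

```python
def dominators(succs, entry):
    """Iterative dominator sets for a CFG given as {label: [successor labels]}.

    Every label that appears as a successor must also be a key of succs.
    """
    nodes = set(succs)
    preds = {n: set() for n in nodes}
    for n, ss in succs.items():
        for s in ss:
            preds[s].add(n)

    # Start from "everything dominates everything" and shrink to a fixed point.
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            pred_doms = [dom[p] for p in preds[n]]
            common = set.intersection(*pred_doms) if pred_doms else set()
            new = {n} | common
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom


def cast_is_justified(succs, entry, cast_label, case_target):
    """Assumed rule: a cast is ok if the case-branch target dominates its block."""
    return case_target in dominators(succs, entry)[cast_label]


# Hypothetical CFG: a switch in "entry" branches to "L_cons" or "L_nil".
cfg = {
    "entry": ["L_cons", "L_nil"],
    "L_cons": ["join"],
    "L_nil": ["join"],
    "join": [],
}
```

With this graph, a cast placed in `L_cons` is accepted because that block dominates itself, while the same cast placed in `join` is rejected because `join` is also reachable through `L_nil`.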
> > Combined all the switch statements used by Rssa and Machine into a
> > single datatype -- see backend/switch.sig. With that and the changes
> > to operands, Rssa and Machine are starting to look suspiciously
> > similar. Hopefully one day we will be able to unify them.
>
> What are the major differences between RSSA and Machine right now?
> Mostly the distinction between stack and backend registers?
RSSA uses variables while MACHINE uses registers and stack offsets.
Variables and registers are essentially the same thing, with the
difference being that variables can be live across nontail calls,
while registers can't. But I think we can push that difference into
the type checker, unify variables and registers and view register
allocation as a pass that enforces the invariant that variables are
not live across nontail calls.
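
The invariant described above can be sketched as a small check. This is only an illustration under simplifying assumptions: a single straight-line block, instructions reduced to their defined and used variables, and a flag marking nontail calls. The `Instr` type and `live_across_calls` function are hypothetical and do not correspond to anything in the MLton sources.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    defs: frozenset = frozenset()   # variables written by this instruction
    uses: frozenset = frozenset()   # variables read by this instruction
    nontail_call: bool = False      # True if this is a nontail call


def live_across_calls(block, live_out=frozenset()):
    """Return the variables that are live across some nontail call in block.

    Walks the block backwards, maintaining the set of variables live after
    the current instruction. Results defined by the call itself do not count
    as live across it.
    """
    live = set(live_out)
    violations = set()
    for ins in reversed(block):
        if ins.nontail_call:
            violations |= live - ins.defs
        live = (live - ins.defs) | ins.uses
    return violations


# x = ...; y = f(x) as a nontail call; z = x + y
block = [
    Instr(defs=frozenset({"x"})),
    Instr(defs=frozenset({"y"}), uses=frozenset({"x"}), nontail_call=True),
    Instr(defs=frozenset({"z"}), uses=frozenset({"x", "y"})),
]
```

Here `live_across_calls(block)` reports `x`, since it is used after the call. In the unified view, a type checker could reject such a program, and register allocation would be the pass that makes the set empty, for example by moving `x` to a stack slot.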
Another difference is that RSSA groups blocks by function while
MACHINE groups blocks by chunk. The main consequence is that the RSSA
grouping makes the information about the returns and raises of a block
implicit in the function the block is in. That should be easy to fix
by attaching the raises and returns info to every block. Then we can
unify the notions of function and chunk.
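
A rough sketch of what attaching the information to every block might look like follows. It assumes that a function's returns and raises are each either a tuple of type names or `None` (meaning the function cannot return or cannot raise); that representation, and the names `Block`, `attach_signature`, and `make_chunk`, are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Block:
    label: str
    # Copied from the enclosing function so the block no longer depends on it.
    returns: Optional[Tuple[str, ...]]
    raises: Optional[Tuple[str, ...]]


def attach_signature(func_returns, func_raises, labels):
    """Make the per-function returns/raises explicit on each block."""
    return [Block(label, func_returns, func_raises) for label in labels]


def make_chunk(*block_groups):
    """Once blocks are self-describing, any grouping of them is valid."""
    return [b for group in block_groups for b in group]


f_blocks = attach_signature(("int",), None, ["f_entry", "f_loop"])
g_blocks = attach_signature(("bool",), ("exn",), ["g_entry"])
chunk = make_chunk(f_blocks, g_blocks)
```

Because each block in `chunk` carries its own returns and raises, blocks from different functions can sit in one group without losing information, which is the property needed to treat functions and chunks uniformly.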
There are a few other minor differences, like some operands that are in
one but not the other, but mostly I think what's left is pushing
through all the details.
> > The backend register allocation no longer attempts to share a
> > register for multiple variables. This may cause performance problems
> > since the local{char,int,...} arrays used by the native codegen to
> > cache real registers will no longer be as small or as densely used.
>
> Have you run any benchmarks against something from before the merge?
The benchmarks were fine (see below), but oddly enough, despite that,
the self-compile performance was horrible. For example, here's what
I saw on my usual test machine.
Compile SML starting
   pre codegen starting
   pre codegen finished in 118.24 + 31.22 (21% GC)
   x86 code gen starting
   x86 code gen finished in 338.65 + 68.80 (17% GC)
Compile SML finished in 456.89 + 100.02 (18% GC)
I just put in some simple register sharing code, not as good as what
was there before, and was able to recover some of the performance.
It's amazing to me that the register sharing makes this much
difference.
Compile SML starting
   pre codegen starting
   pre codegen finished in 107.97 + 38.83 (26% GC)
   x86 code gen starting
   x86 code gen finished in 138.15 + 55.34 (29% GC)
Compile SML finished in 246.12 + 94.17 (28% GC)
I think that's worse than what was there before, so I'm going to
retrofit the old register sharing to the new RSSA/MACHINE. It
shouldn't be too bad.
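
For readers unfamiliar with register sharing, the general idea can be sketched as follows. This is a generic greedy interval assignment, not the algorithm MLton used before or after this change. It assumes each variable has a single live range given as inclusive `(start, end)` positions, and the function name `share_registers` is hypothetical.

```python
import heapq


def share_registers(ranges):
    """Assign register indices so variables with disjoint live ranges share.

    ranges: {variable: (start, end)} with inclusive end positions.
    Returns {variable: register index}, reusing the lowest free index.
    """
    assignment = {}
    active = []    # min-heap of (end, register) for ranges still live
    free = []      # min-heap of register indices available for reuse
    next_reg = 0
    for var, (start, end) in sorted(ranges.items(), key=lambda kv: kv[1]):
        # Release registers whose variable died before this one starts.
        while active and active[0][0] < start:
            _, reg = heapq.heappop(active)
            heapq.heappush(free, reg)
        if free:
            reg = heapq.heappop(free)
        else:
            reg = next_reg
            next_reg += 1
        assignment[var] = reg
        heapq.heappush(active, (end, reg))
    return assignment


example = share_registers({"a": (0, 2), "b": (1, 4), "c": (3, 5), "d": (5, 6)})
# example == {"a": 0, "b": 1, "c": 0, "d": 1}
```

In this example four variables fit in two registers, whereas one register per variable would need four. Fewer, more densely used indices is the property that keeps arrays such as the local{char,int,...} caches mentioned above small.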
Anyways, here's the benchmarks (with no register sharing at all). The
one problem with tensor was in MLton0, not MLton1.
MLton0 -- /usr/bin/mlton
MLton1 -- mlton
run time ratio
benchmark          MLton1
barnes-hut           1.00
boyer                0.92
checksum             1.00
count-graphs         1.03
DLXSimulator         1.00
fft                  1.00
fib                  1.01
hamlet               0.93
imp-for              0.95
knuth-bendix         0.88
lexgen               1.05
life                 1.05
logic                0.99
mandelbrot           1.00
matrix-multiply      1.00
md5                  1.00
merge                0.98
mlyacc               1.00
model-elimination    1.01
mpuz                 1.07
nucleic              0.93
peek                 1.00
psdes-random         0.99
ratio-regions        0.99
ray                  0.99
raytrace             1.03
simple               1.00
smith-normal-form    1.00
tailfib              1.00
tak                  1.00
tensor              ~1.00
tsp                  1.03
tyan                 1.00
vector-concat        1.05
vector-rev           1.00
vliw                 0.96
wc-input1            1.00
wc-scanStream        0.99
zebra                1.10
zern                 1.00
size
benchmark             MLton0     MLton1
barnes-hut           104,080    113,328
boyer                141,303    135,991
checksum              44,551     46,927
count-graphs          66,399     64,815
DLXSimulator         102,208    103,248
fft                   53,595     55,563
fib                   44,591     46,991
hamlet             1,227,840  1,240,128
imp-for               44,511     46,927
knuth-bendix          87,136     87,728
lexgen               172,653    166,205
life                  64,815     66,999
logic                104,631    106,919
mandelbrot            44,623     47,079
matrix-multiply       45,127     47,503
md5                   53,720     55,816
merge                 45,879     48,311
mlyacc               535,501    506,829
model-elimination    634,288    622,416
mpuz                  50,519     51,943
nucleic              191,999    194,407
peek                  52,760     53,776
psdes-random          45,727     47,671
ratio-regions         62,999     65,287
ray                  104,224    109,232
raytrace             278,701    275,869
simple               200,587    201,691
smith-normal-form    181,924    187,748
tailfib               44,319     46,767
tak                   44,751     47,135
tensor                     *    111,163
tsp                   59,728     60,944
tyan                 107,648    107,424
vector-concat         45,087     48,103
vector-rev            44,911     47,311
vliw                 323,353    320,953
wc-input1             66,733     68,613
wc-scanStream         67,213     69,341
zebra                143,272    155,112
zern                  51,330     52,866