x86 performance
Stephen Weeks
MLton@sourcelight.com
Wed, 9 Aug 2000 10:02:42 -0700 (PDT)
> > The C compiler uses leal as cheap 3-address arithmetic while the x86
> > version uses a move followed by an add constant. Note, the C
>
> It's certainly easy to find moves followed by an add constant, but knowing
> that they correspond to pointers isn't information that's available.
Why does it matter whether or not it's a pointer? leal just computes
base + constant, so can't you always use it for adds of small constants? Maybe
you could even do this directly in translate, without needing a peephole pass.
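To make the suggestion concrete, here is a minimal sketch of the rewrite in Python. It works on AT&T-syntax strings only to keep the example short. The function name fuse_mov_add and the two regular expressions are made up for illustration and are not part of the MLton code generator. One assumption to flag: addl sets the condition flags and leal does not, so the rewrite is only safe when nothing later reads the flags produced by the add.

```python
import re

# Hypothetical, simplified patterns for register-to-register moves and
# add-immediate instructions in AT&T syntax. A real code generator would
# match on a structured instruction type rather than on strings.
MOV = re.compile(r"movl\s+(%\w+),\s*(%\w+)$")
ADD = re.compile(r"addl\s+\$(-?\d+),\s*(%\w+)$")


def fuse_mov_add(instrs):
    """Rewrite 'movl %src, %dst; addl $c, %dst' as 'leal c(%src), %dst'.

    Assumption: the flags set by the addl are dead, since leal leaves
    the flags untouched.
    """
    out = []
    i = 0
    while i < len(instrs):
        mov = MOV.match(instrs[i].strip())
        add = ADD.match(instrs[i + 1].strip()) if i + 1 < len(instrs) else None
        # Fuse only when the add targets the register the move just wrote.
        if mov and add and mov.group(2) == add.group(2):
            src, dst = mov.groups()
            out.append(f"leal {add.group(1)}({src}), {dst}")
            i += 2
        else:
            out.append(instrs[i])
            i += 1
    return out
```

For example, fuse_mov_add(["movl %eax, %ebx", "addl $4, %ebx"]) gives ["leal 4(%eax), %ebx"]. Nothing in the match depends on whether %eax holds a pointer.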
> I wouldn't call this a peep-hole optimization, but it's certainly
> something I hope to be able to support. Part of this should fall out of
> eliminating redundant jumps. Also, since we have the pseudo-regs live at
> entry for each block, it's possible to pass those values between blocks in
> real registers, rather than saving and restoring them. Unfortunately,
> that's going to hurt in terms of register-register moves given the way the
> register allocator is written, since I'll need to shuffle the pseudo-regs
> from wherever they end up living to where I want them for the next block.
Yeah. The right thing to do is to process the basic blocks in some kind of
DFS postorder, so that when you process a block you have (most of the time)
already processed its targets. Then you know where each target wants its live
values and can try to put them there. Appel talks about this a little in
Section 13.7 of Compiling with Continuations. Section 9.7 of the dragon book
talks about this as well.
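A minimal sketch of the ordering in Python, assuming the control-flow graph is given as a dict from block label to a list of successor labels. That representation and the name dfs_postorder are made up for illustration. This is a plain depth-first postorder, not necessarily the exact scheme either book describes.

```python
def dfs_postorder(cfg, entry):
    """Return the blocks reachable from entry in DFS postorder.

    Every block appears after all of its successors, except where the
    successor is reached through a back edge (a loop). That exception is
    the "most of the time" case.
    """
    visited = set()
    order = []

    def visit(block):
        visited.add(block)
        for succ in cfg.get(block, []):
            if succ not in visited:
                visit(succ)
        # All unvisited successors are done, so this block can be emitted.
        order.append(block)

    visit(entry)
    return order


# Hypothetical four-block CFG containing one loop.
cfg = {
    "entry": ["loop"],
    "loop": ["body", "exit"],
    "body": ["loop"],
    "exit": [],
}
print(dfs_postorder(cfg, "entry"))  # ['body', 'exit', 'loop', 'entry']
```

Here "loop" is handled after both of its targets and "entry" after "loop", so the register allocator would already know where those targets want their live pseudo-regs. The only block processed before its target is "body", whose target "loop" is reached through the back edge.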