x86 performance
Matthew Fluet
fluet@research.nj.nec.com
Wed, 9 Aug 2000 10:13:08 -0400 (EDT)
> Note, converting to the C-style code (at least in this case) is trivial for a
> peep-hole optimizer.
I'm not sure that I entirely agree. It's certainly true for the decl
statement, but it's not clear to me that the other two could be handled by
peep-hole optimizations (at least in the framework where I'm working).
> The C compiler uses leal as cheap 3-address arithmetic while the x86
> version uses a move followed by an add constant. Note, the C
It's certainly easy to find moves followed by an add constant, but knowing
that they correspond to pointers isn't information that's available.
(Another argument for RCPS and TAL?) In this particular case, I can
actually get by with modifying the translation of the limitCheck to use a
leal instead of the movl/addl combo. But are there other cases where I
want a leal? (I'm thinking in terms of the statements in the machine IL
that don't have a lot of supporting code. Since there aren't really any
pointer comparisons, maybe there aren't.)
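
For concreteness, here is a rough sketch of what the movl/addl-to-leal
rewrite could look like as a peep-hole pass. It is only an illustration:
the tuple representation of instructions and the names fuse_mov_add and
flags_dead are made up for this example and are not the compiler's actual
machine IL. The rewrite is purely syntactic and consults no pointer
information. The one extra condition it needs is that the condition flags
set by addl are not read afterward, since leal leaves the flags untouched.

```python
from typing import List, Tuple

# Hypothetical instruction form: (opcode, source, destination), AT&T order.
Insn = Tuple[str, str, str]


def is_reg(op: str) -> bool:
    return op.startswith("%")


def is_imm(op: str) -> bool:
    return op.startswith("$")


def fuse_mov_add(insns: List[Insn], flags_dead: bool = True) -> List[Insn]:
    """Rewrite `movl %a, %b; addl $c, %b` into `leal c(%a), %b`.

    flags_dead is an assumption supplied by the caller: it must be True
    only if nothing reads the flags that the addl would have set.
    """
    out: List[Insn] = []
    i = 0
    while i < len(insns):
        cur = insns[i]
        nxt = insns[i + 1] if i + 1 < len(insns) else None
        if (flags_dead and nxt is not None
                and cur[0] == "movl" and nxt[0] == "addl"
                and is_reg(cur[1]) and is_reg(cur[2])
                and is_imm(nxt[1]) and nxt[2] == cur[2]):
            # Drop the leading '$' of the immediate to form the displacement.
            out.append(("leal", f"{nxt[1][1:]}({cur[1]})", cur[2]))
            i += 2
        else:
            out.append(cur)
            i += 1
    return out


code = [("movl", "%esi", "%eax"), ("addl", "$16", "%eax"),
        ("cmpl", "%eax", "%edi")]
for insn in fuse_mov_add(code):
    print(insn)
```

A real pass would have to compute flags_dead from liveness information
rather than take it as a parameter.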
> The C compiler kept a value in a register while the x86 had to store it
> into memory and then reload it. Also the x86 code used an absolute
I wouldn't call this a peep-hole optimization, but it's certainly
something I hope to be able to support. Part of this should fall out of
eliminating redundant jumps. Also, since we have the pseudo-regs live at
entry for each block, it's possible to pass those values between blocks in
real registers, rather than saving and restoring them. Unfortunately,
that's going to cost extra register-register moves given the way the
register allocator is written, since I'll need to shuffle the pseudo-regs
from wherever they end up living to where I want them for the next block.
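
To make the shuffling cost concrete, here is a minimal sketch of turning a
set of simultaneous register-to-register moves into a sequence of ordinary
moves. It is an illustration only, not a description of how the register
allocator works: the function name sequentialize, the register names, and
the assumption that a free scratch register is available are all
hypothetical. (On x86 a cycle could also be broken with xchgl instead of a
scratch register.)

```python
from typing import Dict, List, Tuple


def sequentialize(moves: Dict[str, str], scratch: str) -> List[Tuple[str, str]]:
    """Order a parallel move so no source is clobbered before it is read.

    moves maps each destination register to its source register.
    scratch is assumed to be a free register not mentioned in moves.
    Returns (source, destination) pairs in execution order.
    """
    # Moves from a register to itself need no code.
    pending = {d: s for d, s in moves.items() if d != s}
    out: List[Tuple[str, str]] = []
    while pending:
        sources = set(pending.values())
        ready = [d for d in pending if d not in sources]
        if ready:
            # Nothing still needs the old value of d, so overwrite it.
            d = ready[0]
            out.append((pending.pop(d), d))
        else:
            # Only cycles remain: save one register in scratch and make
            # every pending reader of it read from scratch instead.
            d = next(iter(pending))
            out.append((d, scratch))
            for k, s in list(pending.items()):
                if s == d:
                    pending[k] = scratch
    return out


# Swap %eax and %ebx while also copying the old %eax into %ecx.
wanted = {"%eax": "%ebx", "%ebx": "%eax", "%ecx": "%eax"}
for src, dst in sequentialize(wanted, "%edx"):
    print(f"movl {src}, {dst}")
```

Every pair emitted here is one of the register-register moves in question,
and each cycle among the pseudo-regs adds one more on top of that.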
> You didn't say what kind of machine you were running on, but lets suppose it
It's a 200 MHz PPro.