benchmarks
Matthew Fluet
fluet@research.nj.nec.com
Tue, 15 Aug 2000 17:13:55 -0400 (EDT)
> > I don't quite know what's up with smith-normal-form,
> > especially considering that the x86-codegen's executable is just as fast.
>
> All the time in smith-normal-form is in the gmp IntInf libraries. MLton
> optimization doesn't matter.
Yes, but that doesn't explain why gcc is taking so long on it (or why the
x86-codegen is taking so little time).
> Cool. I'd vote for doubles pretty soon too. My current thinking is we shoot
> for a release by January. I think we're well on the way, but I'd like to live
> with the backend for as long as possible.
Doubles shouldn't be too difficult, although register allocation on them
will be even worse than for integers.
> > Finally, there are some other tweaks I'd like to try to avoid some memory
> > references that I'm seeing. I don't expect anything spectacular, but
> > every little bit might help.
>
> I've been thinking a little bit about all of the must-not alias information
> that's available on the CPS and Machine ILs. For example, we know that stack
> slots and heap object must not alias. We know that heap objects of different
> types must not alias. Would getting this information down to your backend help?
> I'm thinking at least of the kinds of things Henry mentioned a while back in the
> FFT code.
The short answer is that it wouldn't necessarily help because I'm assuming
that already. The longer answer is that it probably would help if I
make some minor changes to the register allocator so that it isn't assuming
too much. Right now, I'm (probably incorrectly) assuming that any two
memory locations which aren't structurally equal aren't "really" equal.
This is fine for lots of comparisons, but I guess it could be incorrect in
some cases, e.g. OP(RP(1), 0) vs OP(RP(2), 0). I've never had a bug that
I could trace back to this, but that doesn't mean it's not there. And
the more I think about it, the scarier it is, but I think it's limited to
arrays and offsets. Everything else should be mutually disjoint (globals,
pseudo-regs, stack slots) and will be disjoint from arrays and offsets.
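
To make the distinction above concrete, here is a rough sketch in Python
(not the SML of the actual codegen). All of the names (Global, StackSlot,
PseudoReg, Offset, may_alias) are hypothetical stand-ins and not the real
x86 MemLoc representation. It assumes the disjointness described above, and
it assumes that two offsets from the same base overlap only when the offsets
are equal.

```python
from dataclasses import dataclass
from typing import Union


# Hypothetical stand-ins for memory-location classes; not MLton's real types.
@dataclass(frozen=True)
class Global:
    name: str


@dataclass(frozen=True)
class StackSlot:
    offset: int


@dataclass(frozen=True)
class PseudoReg:
    index: int


@dataclass(frozen=True)
class Offset:
    base: "MemLoc"  # location holding the pointer that is dereferenced
    offset: int


MemLoc = Union[Global, StackSlot, PseudoReg, Offset]


def structurally_equal(a: MemLoc, b: MemLoc) -> bool:
    """The current assumption: two locations are the same only if they
    have the same shape and fields."""
    return a == b


def may_alias(a: MemLoc, b: MemLoc) -> bool:
    """A more conservative test: pointer-based locations with different
    bases might still name the same memory."""
    if a == b:
        return True
    if isinstance(a, Offset) and isinstance(b, Offset):
        if a.base == b.base:
            # Same pointer: assumed to overlap only at equal offsets.
            return a.offset == b.offset
        # Different bases may hold the same pointer at run time.
        return True
    # Globals, pseudo-regs and stack slots are assumed mutually disjoint,
    # and disjoint from anything reached through a pointer.
    return False
```

With these definitions, Offset(PseudoReg(1), 0) and Offset(PseudoReg(2), 0)
are not structurally equal, yet may_alias reports that they could refer to
the same location, which is the case the structural test gets wrong.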
The information could be used, although how easily depends on its form.
There isn't an inverse map from x86 MemLocs to Machine IL Operands, so an
operand * operand -> bool function isn't immediately helpful. I'll think
about it some more as well.