benchmarks
Stephen Weeks
MLton@sourcelight.com
Tue, 15 Aug 2000 11:51:48 -0700 (PDT)
> Compile Times: user+sys
> benchmark          c-codegen  x86-codegen
> checksum                3.15         3.15
> count-graphs            9.60         7.34
> fib                     2.81         2.88
> knuth-bendix           15.70        10.63
> life                    7.30         5.75
> logic                  47.36        30.60
> mlyacc                443.73       387.54
> mpuz                    4.65         4.03
> ratio-regions          17.32        13.03
> smith-normal-form     221.73        94.04
> tak                     2.88         2.90
> wc                      7.50         6.03
>
> This seems to confirm that we can do better than gcc, particularly for
> large programs.
Absolutely. I think there's still a win speeding up your pass, which must
be taking a sizeable portion of compile time.
> I don't quite know what's up with smith-normal-form,
> especially considering that the x86-codegen's executable is just as fast.
All the time in smith-normal-form is in the gmp IntInf libraries. MLton
optimization doesn't matter.
> Running Times: user+sys
> benchmark          c-codegen  x86-codegen  x86/c
> checksum               11.58        13.20   1.14
> count-graphs           18.87        21.20   1.12
> fib                    21.26        17.28   0.81
> knuth-bendix           37.32        39.27   1.05
> life                  103.25       116.52   1.13
> logic                  91.48        77.36   0.79
> mlyacc                 41.10        34.70   0.84
> mpuz                   76.60        90.85   1.19
> ratio-regions          41.40        39.79   0.96
> smith-normal-form       4.04         4.02   1.00
> tak                    48.46        37.86   0.78
> wc                     24.72        36.34   1.47
>
> We're closing the gap here.
I'll say. The only really disappointing one is life, because c-codegen is
already (slightly) slower than SML/NJ.
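
For reading the running-time table: the x86/c column appears to be the x86-codegen time divided by the c-codegen time, so anything below 1.00 is a win for the native backend. A minimal Python sketch of that calculation, using three rows copied from the table (the `ratio` helper and the `times` dictionary are illustrative names, not part of any benchmark script):

```python
# Running times (user+sys) for three rows copied from the quoted table:
# benchmark -> (c-codegen, x86-codegen)
times = {
    "checksum": (11.58, 13.20),
    "fib": (21.26, 17.28),
    "wc": (24.72, 36.34),
}

def ratio(c_time, x86_time):
    """x86/c ratio, rounded to two decimals as in the table."""
    return round(x86_time / c_time, 2)

for name, (c_time, x86_time) in times.items():
    print(f"{name:10s} {ratio(c_time, x86_time):.2f}")
```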
> collapsing if-s whose branches are the same label (yes, this really does
> occur after eliminating jumps to jumps)
Tell me the program and I'll look into why CPS optimization isn't getting it.
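
To make sure we mean the same thing, here is roughly the situation I have in mind, as a toy Python sketch rather than anything resembling the real CPS or x86 passes. The block representation, `resolve`, and `simplify` are all made up for illustration, and the sketch assumes that evaluating the condition has no side effects:

```python
# Toy control-flow graph: label -> (body, transfer).
# A transfer is ("jump", target), ("if", cond, then_label, else_label),
# or ("return",). All of this is a hypothetical representation.

def resolve(blocks, label):
    """Follow chains of empty blocks that do nothing but jump."""
    seen = set()
    while label not in seen:
        body, transfer = blocks[label]
        if body or transfer[0] != "jump":
            break
        seen.add(label)          # guard against cycles of empty jumps
        label = transfer[1]
    return label

def simplify(blocks):
    """Eliminate jumps to jumps, then collapse ifs with identical arms."""
    out = {}
    for label, (body, transfer) in blocks.items():
        if transfer[0] == "jump":
            transfer = ("jump", resolve(blocks, transfer[1]))
        elif transfer[0] == "if":
            _, cond, then_label, else_label = transfer
            then_label = resolve(blocks, then_label)
            else_label = resolve(blocks, else_label)
            if then_label == else_label:
                # Both arms reach the same label, so the test is redundant.
                transfer = ("jump", then_label)
            else:
                transfer = ("if", cond, then_label, else_label)
        out[label] = (body, transfer)
    return out

blocks = {
    "entry": (["x = f()"], ("if", "x", "a", "b")),
    "a": ([], ("jump", "join")),
    "b": ([], ("jump", "join")),
    "join": (["y = 1"], ("return",)),
}
simplified = simplify(blocks)
print(simplified["entry"])
```

Here the two arms only become identical once the empty blocks `a` and `b` are skipped, which is why the pattern shows up after jump-to-jump elimination and not before.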
> There are two other simplifications that I would like to try:
> (1) currently, before a transfer, all pseudo-regs are flushed.
All? You are only flushing live ones, right? :-)
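
What I mean by flushing only the live ones, as a small Python sketch. The names (`flush_before_transfer`, `cached`, `live`) are hypothetical, and it assumes the liveness set at the transfer target is already available and that `cached` holds only pseudo-regs whose register copy is newer than memory:

```python
def flush_before_transfer(cached, live):
    """Return the stores needed before a transfer.

    cached: dict mapping pseudo-register name -> machine register holding it
    live:   set of pseudo-register names live at the transfer target
    """
    # Dead pseudo-regs are simply dropped; only live ones cost a store.
    return [
        ("store", machine_reg, pseudo)
        for pseudo, machine_reg in sorted(cached.items())
        if pseudo in live
    ]

stores = flush_before_transfer(
    cached={"r1": "eax", "r2": "ebx", "r3": "ecx"},
    live={"r1", "r3"},
)
print(stores)
```

With three cached pseudo-regs and two of them live, this emits two stores instead of three.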
> Adding (1) shouldn't be difficult at all. Adding (2) will be a little bit
> more difficult, probably not by the end of this week.
Cool. I'd vote for doubles pretty soon too. My current thinking is we shoot
for a release by January. I think we're well on the way, but I'd like to live
with the backend for as long as possible.
> Finally, there are some other tweaks I'd like to try to avoid some memory
> references that I'm seeing. I don't expect anything spectacular, but
> every little bit might help.
I've been thinking a little bit about all of the must-not alias information
that's available on the CPS and Machine ILs. For example, we know that stack
slots and heap objects must not alias. We know that heap objects of different
types must not alias. Would getting this information down to your backend help?
I'm thinking at least of the kinds of things Henry mentioned a while back in the
FFT code.
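
As a rough sketch of the kind of query I'm imagining, in Python for brevity. The `Loc` record and `must_not_alias` are hypothetical, not an existing interface; the predicate encodes only the two facts above and answers conservatively otherwise:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Loc:
    """A memory location as the backend might see it (hypothetical)."""
    region: str   # "stack" or "heap"
    ty: str = ""  # object type, meaningful only for heap locations

def must_not_alias(a, b):
    """True only when the two locations are known never to overlap."""
    if a.region != b.region:
        return True               # stack slots never alias heap objects
    if a.region == "heap":
        return a.ty != b.ty       # heap objects of different types
    return False                  # otherwise assume they may alias

stack_slot = Loc("stack")
real_array = Loc("heap", "real array")
int_ref = Loc("heap", "int ref")

print(must_not_alias(stack_slot, real_array))
print(must_not_alias(real_array, int_ref))
print(must_not_alias(real_array, real_array))
```

With something like this, the backend could keep a loaded value in a register across a store whenever the load and store locations are known not to alias.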