Team PLClub ICFP entry -- comparing the performance of OCAML and SML
Allen Leung
leunga@cs.nyu.edu
Fri, 13 Oct 2000 19:04:38 -0400 (EDT)
> I don't believe the unexplained part has anything to do with whole
> program analysis, register allocation, or peephole optimization.
> The new SML/NJ backend does a pretty good job of all these and a
> more sophisticated optimizer can seldom yield more than 10% speedup.
>
> -Zhong Shao
> (shao-zhong@cs.yale.edu)
>
Hi,
Actually, the SML/NJ backend currently uses the ``wrong'' framework for
FP register allocation on the x86. Instead of using the FP stack registers
as registers, it uses them only as temporaries for evaluating expressions.
Virtual registers are actually placed on the (memory) stack.
So there is a huge penalty for FP-intensive loops, compared to using the
``right'' framework. How many of these benchmarks are FP intensive? The
unexplained part of SML/NJ's performance may have something to do with
register allocation after all.
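
As a rough illustration of what "FP intensive" means here, consider a
loop like the following. It is only a sketch in OCaml; the function name
dot is made up for this example and is not taken from any of the
benchmarks. If the accumulator lives in an FP register, each iteration
needs just the two array loads, a multiply and an add. If it is a virtual
register kept on the memory stack, each iteration also has to reload it
and store it back.

```ocaml
(* Hypothetical FP-intensive kernel: dot product of two float arrays.
   The accumulator [acc] is the value one would like kept in an FP
   register for the whole loop. *)
let dot (a : float array) (b : float array) : float =
  let n = min (Array.length a) (Array.length b) in
  let acc = ref 0.0 in
  for i = 0 to n - 1 do
    (* one multiply and one add per iteration *)
    acc := !acc +. a.(i) *. b.(i)
  done;
  !acc
```

Whether a given compiler actually keeps acc in a register depends on its
backend; the point is only that loops of this shape are the ones most
sensitive to how FP registers are allocated.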
cheers,
Allen Leung
(leunga@cs.nyu.edu)