re new SML/NJ

Matthew Fluet
Mon, 13 Aug 2001 12:03:37 -0700 (PDT)

> Speaking of which, I tried the silly floating point test that appeared in
> comp.lang.functional
>     fun test (n, x) =
>            if n = 0
>               then x
>               else test (n - 1, x + Math.cos x)
> and our code was definitely nicely faster than gcc:
>     gcc     278.9 nanoseconds
>     MLton   234.2 nanoseconds
> so we are 20% faster.  I looked at the code, and ours still does one floating
> point load and one store per loop, but the C version is doing some really
> funny stuff.
> In the best of all worlds, the back end would figure out that the floating
> point register only has to be stored when the loop finishes.

I finally checked this out under the new codegen (which is carrying
floating-point values across blocks in registers).  Here's the hot loop:

	cmpl (gcState+8),%esi
	jbe skipGC_7
	testl %esp,%esp
	jz L_241
	decl %esp
	jo L_242
	fld %st
	faddp %st, %st(1)
	jmp statementLimitCheckLoop_7

Pretty good, I think.  No memory traffic -- except for the GC check, which
really seems like it should be delayed until the loop exits.