[MLton-user] Why is recursion slower than C?

Wed Apr 25 16:13:24 PDT 2007

* Wesley W. Terpstra <wesley at terpstra.ca> [070425 09:40]:
> On Apr 25, 2007, at 5:44 PM, Neal Glew wrote:
> >Intel hardware (probably AMD as well) has a return address stack as
> >part of the branch prediction logic.  This stack stores the most
> >recent n (for some n) addresses pushed by call instructions.  At a ret
> >instruction the top of this stack is used to predict where the return
> >will return to (and the stack is popped).  Not using these
> >instructions means not using this part of the branch prediction logic
> >and generally results in very poor branch prediction for returns, and
> >thus much stalling.  I suspect that this accounts for most of the
> >performance difference you are experiencing.
> 
> Yeah. It sounds like we're all in agreement about why C is faster  
> here. It's a shame because one would really like SML to be fast at  
> recursion. As long as
> >>Also, it seems MLton is using %ebp for the stack pointer, instead  
> >>of %esp. Why is this? I gather this is why call/return are out
> >In a MLton compiled program, there are effectively two stacks: the  
> >system C stack and the ML stack (allocated in the ML heap, which in  
> >turn is allocated in the system C heap).
> this remains true, it seems unlikely that MLton will ever be able to  
> compete with C on recursion with two non-tail branches.

I think that the reason for this is somewhat simpler. GCC is converting
the recursion into a loop and then unrolling that loop.