[MLton-user] Why is recursion slower than C?
    Neal Glew 
    neal at glew.name
       
    Wed Apr 25 08:44:40 PDT 2007
    
    
  
Intel hardware (probably AMD as well) has a return address stack as
part of the branch prediction logic.  This stack stores the most
recent n (for some n) addresses pushed by call instructions.  At a ret
instruction the top of this stack is used to predict where the return
will return to (and the stack is popped).  Not using these
instructions means not using this part of the branch prediction logic
and generally results in very poor branch prediction for returns, and
thus much stalling.  I suspect that this accounts for most of the
performance difference you are experiencing.  It is possible that
stack limit checks and overflow checks are part of it, but the code
looks pretty tight to me.
Neal
    
    
More information about the MLton-user
mailing list