In that case I would guess that things would run more efficiently if the gcState weren't passed around. On Intel CPUs, every argument passed costs you a push before the call and, in the majority of cases, also a pop (or an equivalent stack adjustment) at the call site after the return.
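
To make the comparison concrete, here is a rough C sketch of the two styles. The struct and function names are made up for illustration and are not taken from the actual runtime. With a stack-based calling convention such as 32-bit cdecl, the first variant needs one extra push per call for the state pointer, while the second reads a global instead.

```c
#include <stddef.h>

/* Hypothetical stand-in for the runtime's GC state; the real layout is not shown here. */
struct gc_state {
    size_t bytes_allocated;
};

/* Variant A: the state is threaded through every call as an extra argument.
   Under a stack-based convention the caller pushes `s` before the call and
   cleans the stack up again after the return. */
void note_alloc_explicit(struct gc_state *s, size_t n) {
    s->bytes_allocated += n;
}

/* Variant B: a single global state (assumed name), so no extra argument is passed. */
struct gc_state the_gc_state;

void note_alloc_global(size_t n) {
    the_gc_state.bytes_allocated += n;
}
```

Whether the difference is measurable depends on the calling convention and on how often these functions are called; conventions that pass the first few arguments in registers would shrink the gap.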