I'm curious why the `write barrier' in the OCaml GC is SO expensive (Leroy says that 70% of the time in the simple example is spent in `cross-generation checks'). The sum loop isn't doing any writes at all. Do you know whether OCaml uses a Chez-style byte vector of possible cross-space pointers, or a MUCH more expensive store list? The latter never made any real sense to me, since every store conses onto the list.
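
To make the two alternatives concrete, here is a rough sketch in C of what I mean. It is not taken from the OCaml or Chez sources; the heap size, card size, and every name below (write_card, write_list, dirty_cards, and so on) are made up for illustration. The byte-vector barrier costs one extra byte store per pointer write, while the store-list barrier appends an entry on every write, even a repeated write to the same slot.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sizes, chosen only for illustration. */
#define HEAP_WORDS     4096
#define CARD_SHIFT     7                      /* 128 words per card */
#define NUM_CARDS      (HEAP_WORDS >> CARD_SHIFT)
#define STORE_LIST_MAX 1024

static uintptr_t  heap[HEAP_WORDS];           /* stand-in for the old generation */
static uint8_t    card_table[NUM_CARDS];      /* one dirty byte per card */
static uintptr_t *store_list[STORE_LIST_MAX]; /* addresses of written slots */
static size_t     store_list_len;

/* Byte-vector (card marking) barrier: do the store, then mark the
   card containing the slot. Repeated writes re-mark the same byte. */
static void write_card(size_t slot, uintptr_t value) {
    assert(slot < HEAP_WORDS);
    heap[slot] = value;
    card_table[slot >> CARD_SHIFT] = 1;
}

/* Store-list barrier: do the store, then append the slot address.
   The list grows on every write, including duplicates. */
static void write_list(size_t slot, uintptr_t value) {
    assert(slot < HEAP_WORDS && store_list_len < STORE_LIST_MAX);
    heap[slot] = value;
    store_list[store_list_len++] = &heap[slot];
}

/* Number of cards the collector would have to scan at the next minor GC. */
static size_t dirty_cards(void) {
    size_t n = 0;
    for (size_t i = 0; i < NUM_CARDS; i++)
        n += card_table[i];
    return n;
}
```

With this sketch, a hundred writes to the same slot leave one dirty card but a hundred store-list entries. A real store list would presumably filter out stores that cannot create an old-to-young pointer, which is exactly the kind of check that costs time on every write.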