I would guess that the slowness of the array-of-array choice is because of the generational GC. In general this makes mutation more expensive since you have to mark a card to indicate that the written to object may contain a pointer to a younger generation.