If converting two Word32.word values to Word64.word and then multiplying them does not compile to a single 32*32->64 multiply instruction, then that is `just' an optimization that MLton is missing and should add.
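
For concreteness, here is a minimal sketch of the source-level idiom in question. The function name mul32x32 is made up for illustration, and the conversion route through toLarge/fromLarge is just one way to write the widening with the Basis Library; whether MLton actually reduces this to one widening multiply is exactly the open question above, so treat this as the pattern to recognize, not as a statement about the generated code.

```sml
(* Hypothetical helper: widen two 32-bit words to 64 bits, then multiply.
 * Word32.toLarge zero-extends into LargeWord.word; Word64.fromLarge
 * converts that to Word64.word. Since both operands fit in 32 bits,
 * the 64-bit product cannot overflow. *)
fun mul32x32 (a: Word32.word, b: Word32.word): Word64.word =
   Word64.* (Word64.fromLarge (Word32.toLarge a),
             Word64.fromLarge (Word32.toLarge b))

(* Sanity check: (2^32 - 1)^2 = 2^64 - 2^33 + 1 *)
val () =
   if mul32x32 (0wxFFFFFFFF, 0wxFFFFFFFF) = 0wxFFFFFFFE00000001
      then ()
   else raise Fail "mul32x32: unexpected product"
```

An optimizer that sees both operands of a 64-bit multiply come from zero-extended 32-bit values could, in principle, select the hardware's 32*32->64 multiply here.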