[MLton] latest benchmarks

Matthew Fluet fluet at tti-c.org
Thu Jun 21 07:01:01 PDT 2007


skaller wrote:
> On Wed, 2007-06-20 at 11:15 -0500, Matthew Fluet wrote:
> 
>> I suspect that the latter behavior is due to the fact that on x86_64, 
>> sequences (arrays/vectors) are indexed by 64-bit integers in the 
>> primitive operations (sub, update, etc), but indexed by 32-bit integers 
>> in the user code (Array.sub, Array.update, etc. since Int.int 
>> corresponds to Int32.int).  Hence, there are quite a few 64/32 
>> conversions going on.
> 
> I doubt that is quite correct. 32 to 64 bit 'in register' extension
> is going to be invisibly fast.

But, this is SML -- we need to handle overflow checking on conversions, 
etc.  As you say, though, a 32-to-64-bit extension doesn't need 
additional checks, so Array.sub and Array.update are actually pretty 
fast.  But, the implementation of something like
   Array.appi : (int * 'a -> unit) -> 'a array -> unit
is internally using a 64-bit integer for the index variable, which it 
repeatedly casts down to a 32-bit integer to deliver to the user 
function.  That cast needs to check for overflow (though I believe that 
one can't construct an array with more than 2^31 - 1 elements, since 
Array.array : int * 'a -> 'a array takes a 32-bit length, so the cast 
can't actually overflow).
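
To make the shape of that loop concrete, here is a rough sketch in user-level SML.  It is only a model and not the actual Basis Library source: the name appiModel is made up for illustration, and it assumes an implementation that provides the optional Int64 structure (MLton does).  The index is kept as an Int64.int and narrowed with Int64.toInt on every iteration; that narrowing is the checked conversion, since it raises Overflow when the value does not fit in Int.int.

```sml
(* Hypothetical model of an appi-style loop; NOT MLton's real implementation.
   The index lives in an Int64.int and is narrowed to Int.int each iteration. *)
fun appiModel (f : int * 'a -> unit) (a : 'a array) : unit =
  let
    (* Widening the length needs no overflow check. *)
    val n : Int64.int = Int64.fromInt (Array.length a)
    fun loop (i : Int64.int) =
      if i >= n then ()
      else
        let
          (* Checked narrowing: raises Overflow if i does not fit in Int.int. *)
          val i' = Int64.toInt i
        in
          f (i', Array.sub (a, i'));
          loop (i + 1)
        end
  in
    loop 0
  end

(* Example use: print each index together with its element. *)
val () =
  appiModel (fn (i, x) => print (Int.toString i ^ ": " ^ x ^ "\n"))
            (Array.fromList ["a", "b", "c"])
```

Because n itself comes from a value that already fits in Int.int, the check in Int64.toInt can never fail inside this loop, which is the same argument as the one above for why the cast can't overflow.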

> The real problem is likely to be the opposite! A 32 bit memory
> read is more expensive than a 64 bit read because the processor
> actually reads 64 bits .. not a big deal. But for a 32 bit
> write it is a big deal: the processor has to read 64 bits,
> modify 32 of them, and store the resulting 64 bits back.
> 
> For this to work properly it has to be made atomic, which effectively
> destroys cache write buffer transparency. 
> 
> Something like that anyhow .. :) the point is, the address and data
> bus on an amd64 are 64 bits so random 32 bit operations actually 
> cost MORE. 32 bits might be faster for sequential operations.

But, wouldn't all of this imply that the 32-bit executables run on an 
amd64 would have even worse performance, since they never use 64-bit data?





