[MLton] Performance of Real.toInt
Matthew Fluet
fluet at tti-c.org
Mon Oct 27 08:48:02 PST 2008
On Sun, 26 Oct 2008, Vesa Karvonen wrote:
> On Fri, Oct 24, 2008 at 10:50 PM, Ryan Newton <rrnewton at gmail.com> wrote:
>> Under MLton I generate code like this:
>>
>> (Real64.toInt IEEEReal.TO_ZERO (var_tmpsmp_77))
>>
>> But it performs very poorly. I haven't researched this, but if I had
>> to guess, I'd bet this is because mlton is implementing some more
>> semantically meaningful notion than C casts.
>
> An excellent guess!
>
>> Nevertheless, is there
>> any inexpensive way to ape the behavior one gets from (int)x in C?
>
> Have you peeked into the real/real.sml source file in MLton's basis
> library implementation? The implementation of Real.toInt uses a
> family of toInt<N>Unsafe functions, that do not set the rounding mode
> or check that the floating point number is in the range of the integer
> type. One could perhaps extend the MLton.Real structure
> (http://mlton.org/MLtonReal) to expose those functions. You could
> then implement the conversion in terms of the unsafe functions.
As Vesa noted, SML's Real.toInt function does a lot more range checking
than C's (int)d cast. In SML, there are at least two floating-point
comparisons (performing the range check), a rounding mode set, a
floating-point round, a rounding mode (re)set, and a floating-point to int
coercion (the toInt<N>Unsafe).
If you are using the C codegen, then toInt<N>Unsafe is implemented by a C
cast; the semantics of a C cast is to convert with truncation (TO_ZERO)
semantics. If you are using the x86 codegen, then toInt<N>Unsafe is
implemented by the 'fist' instruction; the semantics of the 'fist'
instruction is to convert with the current rounding mode. If you are
using the amd64 codegen, then toInt<N>Unsafe is implemented by the
'cvt{s,d}2si{l,q}' instruction; the semantics of the 'cvt{s,d}2si{l,q}'
instruction is to convert with truncation (TO_ZERO) semantics. Since the
implementations of toInt<N>Unsafe do not always obey the current rounding
mode, the SML implementation first does a floating-point round (under an
appropriate rounding mode); thus, all of the toInt<N>Unsafe
implementations behave the same. But, it also means that the
toInt<N>Unsafe primitives are only well defined when the floating-point
value is an integer; on non-integral floating-point values, the different
codegens could return different results.
Note: on x86 with the C-codegen, the C cast actually generates another
set/reset of the rounding mode, because gcc wants to use the 'fist'
instruction, but with truncation (TO_ZERO) semantics (rather than the
current rounding mode). This may also be the case on other architectures.
If you are exclusively using the C-codegen, then exposing the
toInt<N>Unsafe functions in the MLton.Real structure would have the
behavior of a C-cast. (It will still be a little slower, because the cast
will occur in a non-inlined function; we don't inline some of the
floating-point operations, because gcc will constant fold without obeying
possible changes in the rounding mode. Though, given the explanation
above, since C's cast always ignores the current rounding mode and uses
truncation semantics, it may be acceptable to inline.)
If you wanted something a little more well-defined, you could expose in
MLton.Real the composition of Primitive.Real<N>.round with
Primitive.Real<N>.toInt<M>Unsafe. That would first do a floating-point
round to integer (under the current rounding mode), followed by a coercion
to int (which, because the input will be an integral floating-point, will
be well-defined for all implementations). However, this would be slightly
different from a C-cast, since the default floating-point rounding mode is
TO_NEAREST (at least on x86 and amd64, and possibly specified by C99
and/or IEEE754), not TO_ZERO.
So, lots of choices, but nothing jumps out as a clear winner.