[MLton] Number theory and MLton C FFI
Wesley W. Terpstra
terpstra@gkec.tu-darmstadt.de
Fri, 22 Oct 2004 04:09:22 +0200
On Thu, Oct 21, 2004 at 05:03:32PM -0700, Stephen Weeks wrote:
> For fun, I took a look at what MLton generated for your add routine.
>
> movl (xxx+(0*4)),%esp
> addl (yyy+(0*4)),%esp
> movl %esp,%esi
> andl $0x7FFFFFFF,%esi
> shrl $0x1F,%esp
> addl %esi,%esp
Not bad at all!
> call WordU32_toWord64
> call WordU32_toWord64
> call WordU64_rshift
> call WordU64_toWord32
> call WordU64_toWord32
> shrl $0x1,%eax
> addl %ebx,%eax
> movl %eax,%esi
> andl $0x7FFFFFFF,%esi
> shrl $0x1F,%eax
> addl %esi,%eax
... where's the multiply?
I agree the last part is not bad.
> For the record, here's the SML code I used.
Thank you; would you mind if I copied it?
(Your implementation is nicer than mine)
> > Real hardware does this in one instruction.
> > Promoting them to Word64s and then multiplying does not do that.
>
> MLton does not do that right now, since it will do a function call for
> each Word32->Word64 conversion and for the Word64 mul. But, as Henry
> says, it seems quite plausible that MLton will do the right thing some
> day. This is a straightforward local optimization. In any case, my
> main point is that this is not a limitation of SML, just of MLton.
If MLton knew this optimization, it might mean I wouldn't need C.
It wouldn't be too hard to target SML with my code generator.
One of the big advantages of using SML is that I could also make the code
use roots of unity that are computed automatically based on the field
construction. That would be really great as then they code could be reused!
C++ can compute these things at compile time (using templates), but C not.
Actually, how far will the MLton compiler go?
fun first_factor x =
let
open Word64
fun helper i =
if (x mod i) = 0w0 then i else
helper (i+0w1)
in
helper 0w2
end ;
val x = first_factor 0w4611686014132420609 ;
print (Word64.toString x ^ "\n");;
Shouldn't x be computed at compile-time? (a bit extreme perhaps ;)
I see that for my MLton compiler it is not.
> I wonder if your C routines are expensive enough that marshalling is a
> viable way to communicate between SML and C.
Possibly.
Another thing that matters for fast number theory functions is: SIMD.
I imagine automatic use of SIMD in MLton is a long ways off, but a good
programmer can make intrinsics go a long way.
Maybe I should try a source dive into MLton.
If there was an SSE2 : SIMD, MMX : SIMD, AltiVec : SIMD and that 64bit
mul optimization, I could perhaps achieve everything I need in SML.
For now C is the most expedient route, for one since it already works...
Also, so far as I can tell, I can't write a library under MLton and expect
people to be able to use it from C without forcing MLton on them. Are there
any plans for MLton's whole-program optimization to use a different entry
point than main? Being able to list a few method to export as C functions
would be enough to build some useful libraries under MLton.
--
Wesley W. Terpstra