slow matrix multiply
Stephen Weeks
MLton@sourcelight.com
Tue, 10 Jul 2001 18:22:47 -0700
I rewrote matrix.mlton to be more like the OCaml version, using 'a array array
to represent the 2D array and manually hoisting the loop-invariant array
subscript out of the inner loop. This brings the MLton time down to 2.3 (versus
the old 4.8), which is much closer to OCaml's 1.4.
Following is the source and annotated assembly.
Now, MLton's code is pretty close to OCaml's, except that the loop variables
live in stack slots (the loads and stores through %edi) instead of being kept
in registers.
fun loop (k, sum) =
   if k < 0
      then sum
   else loop (k - 1, sum + m1i k * sub (m2, k, j))
loop_54:
        movl (204*1)(%edi),%esp         # %esp = k
        cmpl $0,%esp                    # if k < 0
        jl L_235
        movl %esp,%ebp                  # %ebp = k
        decl %ebp                       # %ebp = k - 1
        movl (196*1)(%edi),%edx         # %edx = m1i
        movl %esp,%ecx                  # %ecx = k
        movl (%edx,%ecx,4),%esp         # %esp = m1i k
        movl (160*1)(%edi),%edx         # %edx = m2
        movl (%edx,%ecx,4),%ebx         # %ebx = sub (m2, k)
        movl %ebp,(204*1)(%edi)         # store k - 1 as the new k
        movl %esp,%eax                  # %eax = m1i k
        movl (192*1)(%edi),%ebp         # %ebp = j
        cltd                            # sign-extend %eax into %edx:%eax
        imull (%ebx,%ebp,4)             # %eax = m1i k * sub (m2, k, j)
        addl %eax,(200*1)(%edi)         # store sum + ...
        jmp loop_54
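
For context, here is a rough sketch of how the inner loop above could sit
inside a complete multiply routine. This is not the actual matrix.mlton
source: the names mmult, n, row, m3, loopi and loopj are made up for
illustration, and it assumes square n x n int matrices. Only loop, m1i, sub,
m2 and j correspond to names in the code shown above. The point is where the
hoisting happens: the row of m1 is fetched once per i, so the inner loop does
a single subscript for m1 and two for m2.

```sml
(* Hypothetical sketch, assuming square n x n int matrices stored as
 * int array array.  Names other than loop, m1i, sub, m2 and j are invented. *)
fun sub (m: int array array, i, j) = Array.sub (Array.sub (m, i), j)

fun mmult (n: int, m1: int array array, m2: int array array) =
   let
      (* Result matrix, one freshly allocated row per index. *)
      val m3 = Array.tabulate (n, fn _ => Array.array (n, 0))
      fun loopi i =
         if i >= n
            then ()
         else
            let
               (* Hoisted subscript: row i of m1 is looked up once per i. *)
               val row = Array.sub (m1, i)
               fun m1i k = Array.sub (row, k)
               val m3i = Array.sub (m3, i)
               fun loopj j =
                  if j >= n
                     then ()
                  else
                     let
                        (* The inner loop from the message, unchanged. *)
                        fun loop (k, sum) =
                           if k < 0
                              then sum
                           else loop (k - 1, sum + m1i k * sub (m2, k, j))
                     in
                        Array.update (m3i, j, loop (n - 1, 0))
                        ; loopj (j + 1)
                     end
            in
               loopj 0
               ; loopi (i + 1)
            end
   in
      loopi 0
      ; m3
   end
```

The m2 access cannot be hoisted the same way, since its row index k changes
on every iteration of the inner loop; that matches the two dependent loads
for sub (m2, k, j) in the assembly.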