[MLton-user] How to write performant network code
Wesley W. Terpstra
wesley at terpstra.ca
Wed Jan 14 13:24:34 PST 2009
On Mon, Jan 12, 2009 at 5:13 AM, Matthew Fluet <fluet at tti-c.org> wrote:
> Does memcpy (or memmove, since the *Array{,Slice}.copy functions need to
> work with potentially overlapping regions) do anything more than a
> word-by-word copy?
Yes. memcpy is usually hand-tuned assembly and extremely fast; it
uses SSE and other tricks. Is it safe to also modify Word8Array.vector
to use memcpy? What about the polymorphic Array.vector?
> If not, then it seems to me that, at least for the
> Word8Array{,Slice}.copy{,Vec} functions, you could stay in SML and use the
> PackWord<N>{Big,Little}.{subVec,subArr,update} functions?
Have you noticed that calling Word32.fromLarge o
PackWord32Little.subVec will generate this:
L_1085:
        movl (c_stackP+0x0),%edi
        xchgl %edi,%esp
        subl $0xC,%esp
        movl 0x4(%ebp),%ecx
        pushl 0x0(%ecx,%esi,4)
        movl %eax,%esi
        movl %ebx,(localWord32+0x8)
        movb %dl,%bl
        call WordU32_extdToWord64
        addl $0x10,%esp
L_1086:
        subl $0x8,%esp
        pushl %edx
        pushl %eax
        call WordU64_extdToWord32
        addl $0x10,%esp
L_1087:
i.e., you need to make two C calls to convert to and from a 64-bit
integer! Any code using PackWord is sloooow. It's probably faster to
call Word8Vector.sub four times and OR the results together.
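
To make the "sub four times and OR" idea concrete, here is a minimal
sketch. The name sub32Little is hypothetical (it is not in the Basis
Library), and it takes a byte offset, whereas PackWord32Little.subVec
indexes in units of 4-byte elements. Each byte is widened through int
rather than LargeWord in the hope of avoiding the 64-bit round trip;
whether that actually beats PackWord would need to be measured.

```sml
(* Sketch: assemble a little-endian 32-bit word from four bytes.
   sub32Little is a hypothetical helper, not part of the Basis Library.
   i is a byte offset into v. *)
fun sub32Little (v : Word8Vector.vector, i : int) : Word32.word =
  let
    (* Widen one byte to Word32 via int, avoiding LargeWord. *)
    fun byte k = Word32.fromInt (Word8.toInt (Word8Vector.sub (v, i + k)))
  in
    Word32.orb (byte 0,
    Word32.orb (Word32.<< (byte 1, 0w8),
    Word32.orb (Word32.<< (byte 2, 0w16),
                Word32.<< (byte 3, 0w24))))
  end
```

Word8Vector.sub still bounds-checks each access, so an out-of-range
offset raises Subscript just as it would with PackWord.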
>> The second bottleneck comes from hashing network addresses. We need to
>> identify who sent us a packet and we keep information for each peer in
>> a hash table. Unfortunately, it's practically impossible to hash the
>> address as obtained from Socket.recvArrFrom. The only method that
>> seems to be available to us is to first convert the address to a
>> string and then to hash that. Unbelievable as it may be, hashing the
>> address this way is one of the slowest parts to processing the packet!
>> The address type can't be passed through the FFI itself (to extract
>> the 32-bit IP+16-bit port) because it is wrapped inside a datatype. So
>> the only options that seem available to me at present are: 1)
>> completely replace all networking calls with direct FFI, by-passing
>> the basis or 2) add some extension in MLton.Socket that lets me get
>> the address out as a Word8VectorSlice.slice. Any better suggestions?
>
> Certainly, 2) is much better than 1).
How about I just add a MLton.Socket.Address.toVector which simply
exposes the underlying Word8Vector.vector in network byte order?
> A while ago, I added a primitive (structural) polymorphic hash:
> http://mlton.org/cgi-bin/viewsvn.cgi?view=rev&rev=6352
> It would seem to suit your purposes: you can use it to hash any value,
> including datatypes.
This is very nice and I didn't know about it. Unfortunately, it's not
enough because I need a universal hash function (one that takes a
'seed' along with the value to hash). Also, one still needs to be able
to serialize a network address out to the network in some situations
(e.g. to say: send the reply message to X, not to me).
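
For illustration, this is roughly the kind of seeded hash that becomes
possible once the address bytes are available as a Word8Vector.vector
(for instance through the proposed MLton.Socket.Address.toVector, which
does not exist yet). The name hashBytes is hypothetical, and the
FNV-1a-style mixing with the seed XORed into the offset basis is only a
stand-in: it is a seeded hash, not a provably universal family.

```sml
(* Sketch: seeded FNV-1a-style hash over raw address bytes.
   hashBytes is a hypothetical name; the seeding scheme is an
   illustrative assumption, not a provably universal hash family. *)
fun hashBytes (seed : Word32.word) (v : Word8Vector.vector) : Word32.word =
  let
    val prime : Word32.word = 0wx01000193   (* 32-bit FNV prime *)
    val basis : Word32.word = 0wx811C9DC5   (* 32-bit FNV offset basis *)
    (* XOR in one byte, then multiply by the prime (mod 2^32). *)
    fun step (b, h) =
      (Word32.xorb (h, Word32.fromInt (Word8.toInt b))) * prime
  in
    Word8Vector.foldl step (Word32.xorb (basis, seed)) v
  end
```

With seed 0w0 this reduces to plain 32-bit FNV-1a; a different seed
per hash table gives a different bucket assignment for the same peer.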