From wesley at terpstra.ca Wed Jan 7 07:49:10 2009
From: wesley at terpstra.ca (Wesley W. Terpstra)
Date: Wed Jan 7 07:49:45 2009
Subject: [MLton-user] How to write performant network code
Message-ID: <162de7480901070749y1cbe1411gea793f7ee66300a6@mail.gmail.com>

I have been working on a network oriented program in SML. Currently we
have two main bottlenecks in our code (according to mlprof and
confirmed via replacing the code with stubs).

The first problem has to do with copying data when assembling a
packet. We've already eliminated as many copies as possible through
careful use of slices, and only one copy happens when finally
constructing the (contiguous) packet which is to be sent. This copy is
one of the slowest points in our code, using Word8ArraySlice.copy. The
problem is that Word8ArraySlice.copy does a byte-by-byte copy with a
not very optimized loop. It seems to me that all the
WordXArray[Slice].copy[Vec] functions could just call out to memcpy.
This should be a lot faster. Is there any objection to my preparing a
patch to apply this optimization in the basis?

The second bottleneck comes from hashing network addresses. We need to
identify who sent us a packet and we keep information for each peer in
a hash table. Unfortunately, it's practically impossible to hash the
address as obtained from Socket.recvArrFrom. The only method that
seems to be available to us is to first convert the address to a
string and then to hash that. Unbelievable as it may be, hashing the
address this way is one of the slowest parts of processing the packet!
The address type can't be passed through the FFI itself (to extract
the 32-bit IP+16-bit port) because it is wrapped inside a datatype. So
the only options that seem available to me at present are: 1)
completely replace all networking calls with direct FFI, by-passing
the basis or 2) add some extension in MLton.Socket that lets me get
the address out as a Word8VectorSlice.slice. Any better suggestions?
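The copy described in the first point above can be illustrated on the C
side. The following is a minimal sketch, not MLton's basis code: the
names copy_bytewise and assemble_packet, and the header/payload split,
are hypothetical and chosen only for illustration. It contrasts a
byte-by-byte loop with a memcpy-based assembly of one contiguous packet.
Note that memcpy requires the source and destination not to overlap; a
copy between possibly overlapping regions would need memmove instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Byte-by-byte copy, roughly the shape of loop the post describes.
   Hypothetical helper, shown only for comparison. It copies forward,
   one byte per iteration. */
void copy_bytewise(uint8_t *dst, const uint8_t *src, size_t len)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < len; i++)
        dst[i] = src[i];
}

/* Hypothetical helper: build one contiguous packet from a header and a
   payload using memcpy. The three buffers are assumed to be distinct
   (non-overlapping), which is what memcpy requires.
   Returns the packet length, or 0 if the packet buffer is too small. */
size_t assemble_packet(uint8_t *packet, size_t cap,
                       const uint8_t *hdr, size_t hdr_len,
                       const uint8_t *payload, size_t payload_len)
{
    assert(packet != NULL && hdr != NULL && payload != NULL);

    /* Capacity check written to avoid overflow in hdr_len + payload_len. */
    if (hdr_len > cap || payload_len > cap - hdr_len)
        return 0;

    memcpy(packet, hdr, hdr_len);                   /* header first   */
    memcpy(packet + hdr_len, payload, payload_len); /* payload after  */
    return hdr_len + payload_len;
}
```

Both paths produce the same bytes; they differ only in how the copy is
carried out, so any speedup from memcpy has to be measured on the target
platform rather than assumed.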

From fluet at tti-c.org Sun Jan 11 20:13:33 2009
From: fluet at tti-c.org (Matthew Fluet)
Date: Sun Jan 11 20:16:59 2009
Subject: [MLton-user] How to write performant network code
In-Reply-To: <162de7480901070749y1cbe1411gea793f7ee66300a6@mail.gmail.com>
References: <162de7480901070749y1cbe1411gea793f7ee66300a6@mail.gmail.com>
Message-ID:

On Wed, 7 Jan 2009, Wesley W. Terpstra wrote:
> I have been working on a network oriented program in SML. Currently we
> have two main bottlenecks in our code (according to mlprof and
> confirmed via replacing the code with stubs).
>
> The first problem has to do with copying data when assembling a
> packet. We've already eliminated as many copies as possible through
> careful use of slices, and only one copy happens when finally
> constructing the (contiguous) packet which is to be sent. This copy is
> one of the slowest points in our code, using Word8ArraySlice.copy. The
> problem is that Word8ArraySlice.copy does a byte-by-byte copy with a
> not very optimized loop. It seems to me that all the
> WordXArray[Slice].copy[Vec] functions could just call out to memcpy.
> This should be a lot faster. Is there any objection to my preparing a
> patch to apply this optimization in the basis?

Does memcpy (or memmove, since the *Array{,Slice}.copy functions need to
work with potentially overlapping regions) do anything more than a
word-by-word copy?

If not, then it seems to me that, at least for the
Word8Array{,Slice}.copy{,Vec} functions, you could stay in SML and use the
PackWord{Big,Little}.{subVec,subArr,update} functions? Or, it is probably
better to directly use the corresponding primitives, in order to amortize
the bounds checking.

> The second bottleneck comes from hashing network addresses. We need to
> identify who sent us a packet and we keep information for each peer in
> a hash table. Unfortunately, it's practically impossible to hash the
> address as obtained from Socket.recvArrFrom.
> The only method that
> seems to be available to us is to first convert the address to a
> string and then to hash that. Unbelievable as it may be, hashing the
> address this way is one of the slowest parts of processing the packet!
> The address type can't be passed through the FFI itself (to extract
> the 32-bit IP+16-bit port) because it is wrapped inside a datatype. So
> the only options that seem available to me at present are: 1)
> completely replace all networking calls with direct FFI, by-passing
> the basis or 2) add some extension in MLton.Socket that lets me get
> the address out as a Word8VectorSlice.slice. Any better suggestions?

Certainly, 2) is much better than 1).

A while ago, I added a primitive (structural) polymorphic hash:
  http://mlton.org/cgi-bin/viewsvn.cgi?view=rev&rev=6352
It would seem to suit your purposes: you can use it to hash any value,
including datatypes.

From wesley at terpstra.ca Wed Jan 14 13:24:34 2009
From: wesley at terpstra.ca (Wesley W. Terpstra)
Date: Wed Jan 14 13:25:08 2009
Subject: [MLton-user] How to write performant network code
In-Reply-To:
References: <162de7480901070749y1cbe1411gea793f7ee66300a6@mail.gmail.com>
Message-ID: <162de7480901141324x722066f7v16e6accc153cf352@mail.gmail.com>

On Mon, Jan 12, 2009 at 5:13 AM, Matthew Fluet wrote:
> Does memcpy (or memmove, since the *Array{,Slice}.copy functions need to
> work with potentially overlapping regions) do anything more than a
> word-by-word copy?

Yes. memcpy is usually hand-crafted and extremely fast assembler. It
uses SSE and other tricks.

Is it safe to also modify Word8Array.vector to use memcpy? What about
polymorphic Array.vector?

> If not, then it seems to me that, at least for the
> Word8Array{,Slice}.copy{,Vec} functions, you could stay in SML and use the
> PackWord{Big,Little}.{subVec,subArr,update} functions?

Have you noticed that calling Word32.fromLarge o PackWord32Little.subVec
will generate this:

L_1085:
        movl (c_stackP+0x0),%edi
        xchgl %edi,%esp
        subl $0xC,%esp
        movl 0x4(%ebp),%ecx
        pushl 0x0(%ecx,%esi,4)
        movl %eax,%esi
        movl %ebx,(localWord32+0x8)
        movb %dl,%bl
        call WordU32_extdToWord64
        addl $0x10,%esp
L_1086:
        subl $0x8,%esp
        pushl %edx
        pushl %eax
        call WordU64_extdToWord32
        addl $0x10,%esp
L_1087:

i.e., you need to make two C calls to convert to and from a 64-bit
integer! Any code using PackWord is sloooow. It's probably faster to
call Word8Vector.sub four times and or the results together.

>> The second bottleneck comes from hashing network addresses. We need to
>> identify who sent us a packet and we keep information for each peer in
>> a hash table. Unfortunately, it's practically impossible to hash the
>> address as obtained from Socket.recvArrFrom. The only method that
>> seems to be available to us is to first convert the address to a
>> string and then to hash that. Unbelievable as it may be, hashing the
>> address this way is one of the slowest parts of processing the packet!
>> The address type can't be passed through the FFI itself (to extract
>> the 32-bit IP+16-bit port) because it is wrapped inside a datatype. So
>> the only options that seem available to me at present are: 1)
>> completely replace all networking calls with direct FFI, by-passing
>> the basis or 2) add some extension in MLton.Socket that lets me get
>> the address out as a Word8VectorSlice.slice. Any better suggestions?
>
> Certainly, 2) is much better than 1).

How about I just add a MLton.Socket.Address.toVector which simply
exposes the underlying Word8Vector.vector in network byte order?

> A while ago, I added a primitive (structural) polymorphic hash:
>  http://mlton.org/cgi-bin/viewsvn.cgi?view=rev&rev=6352
> It would seem to suit your purposes: you can use it to hash any value,
> including datatypes.

This is very nice and I didn't know about it.
Unfortunately, it's not enough because I need a universal hash function
(one that takes a 'seed' with the value to hash). Also, one still needs
to be able to serialize a network address out to the network in some
situations (e.g., to say: send the reply message to X, not me).

From seanmcl at gmail.com Tue Jan 27 08:46:56 2009
From: seanmcl at gmail.com (Sean McLaughlin)
Date: Tue Jan 27 08:47:31 2009
Subject: [MLton-user] comments in path map files?
Message-ID: <6579f8680901270846j360ed8c2y3db683533d1a8ec8@mail.gmail.com>

Hi,

Would it be possible to add some kind of comment syntax to path map
files for use with the -mlb-path-map option?

Thanks,

Sean