[MLton] Re: [MLton-user] How to write performant network code
Matthew Fluet
fluet at tti-c.org
Thu Jan 15 20:43:12 PST 2009
(moved from mlton-user)
On Wed, 14 Jan 2009, Wesley W. Terpstra wrote:
> On Mon, Jan 12, 2009 at 5:13 AM, Matthew Fluet <fluet at tti-c.org> wrote:
>> Does memcpy (or memmove, since the *Array{,Slice}.copy functions need to
>> work with potentially overlapping regions) do anything more than a
>> word-by-word copy?
>
> Yes. memcpy is usually hand-crafted and extremely fast assembler. It
> uses SSE and other tricks. Is it safe to also modify Word8Array.vector
> to use memcpy?
You would want to modify the implementation of Word8Array.vector to create
an uninitialized array, memcpy into the new array, and then cast from
array to vector. So, yes, that would be safe.
> What about polymorphic Array.vector?
That gets a bit trickier. You want to be careful about using memcpy on
polymorphic arrays. The issue is that it constrains the source and
destination arrays to be (permanently) of the same type. For example, if you have two
"(int * bool) array"s and copy from the first into the second, but the
second never uses the bool component, then under the element-by-element
copy, MLton could drop the bool component of the second array and
compensate during the element-by-element copy by only writing the int
component. But, if you require a memcpy, then the src and dst need to be
of exactly the same type.
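
As a rough illustration of the point above, here is a C sketch of the two
element layouts. The struct names and layouts are made up for the example
and are not how MLton actually represents these arrays; the only point is
that once the unused bool is dropped from the destination, the element
sizes differ and a single memcpy over the whole range no longer makes
sense, while a per-element copy still does.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout of an (int * bool) element in the source array. */
struct pair_elem {
    int32_t i;
    bool    b;
};

/* Hypothetical layout of the destination element after the compiler has
 * dropped the bool component that the program never reads. */
struct pruned_elem {
    int32_t i;
};

/* Element-by-element copy: writes only the component the destination
 * still has. The two arrays have different strides, so this cannot be
 * replaced by one memcpy of n * sizeof(element) bytes. */
static void copy_pruned(struct pruned_elem *dst,
                        const struct pair_elem *src,
                        size_t n)
{
    for (size_t k = 0; k < n; k++) {
        dst[k].i = src[k].i;
    }
}
```

Requiring a memcpy would force both arrays to keep the pair_elem layout,
even if the bool in the destination is dead.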
This applies as well to the Word8Array case, but it seems less likely that
you copy from a Word8Array.array to a Word8Array.array and never use the
destination Word8Array.array. On the other hand, with a polymorphic array
instantiated with an abstract type, there seem to be many more
opportunities for pruning unused components. So, I would limit it to
Word<N>Array{,Slice} for now.
Another difficulty with polymorphic arrays is that the size of the array
elements isn't known until late in compilation, and the memcpy needs that
size to know how many bytes to copy.
BTW, since we don't support interior pointers, the copy needs to have
types like:
  Word8Array_copy : (Word8.t array      (* src *)
                     * SeqIndex.t       (* src offset *)
                     * Word8.t array    (* dst *)
                     * SeqIndex.t       (* dst offset *)
                     * SeqIndex.t       (* count *)) -> unit

  Word8Vector_copy : (Word8.t vector    (* src *)
                      * SeqIndex.t      (* src offset *)
                      * Word8.t array   (* dst *)
                      * SeqIndex.t      (* dst offset *)
                      * SeqIndex.t      (* count *)) -> unit
It might be worth adding these as primitives, though it isn't clear that
we can optimize much with regard to them. If, for instance, the
destination array is never read from, then we could drop the copy. But,
that seems unlikely to arise in realistic code.
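
For concreteness, a minimal C sketch of what the runtime side of such a
Word8Array_copy primitive might look like. The function name and the
assumption that bounds checks have already been done on the SML side are
mine, not existing MLton runtime code. The base pointers and offsets are
passed separately, so no interior pointer ever crosses the FFI boundary,
and memmove is used rather than memcpy because src and dst may be the same
array with overlapping ranges.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical runtime helper. Assumes the caller has already checked
 * that src_off + count and dst_off + count are within bounds. */
static void word8_array_copy(const uint8_t *src, size_t src_off,
                             uint8_t *dst, size_t dst_off,
                             size_t count)
{
    /* The interior addresses are formed only here, inside the C call.
     * memmove handles overlapping source and destination ranges. */
    memmove(dst + dst_off, src + src_off, count);
}
```

With a six-byte array holding 1..6, copying four bytes from offset 0 to
offset 2 of the same array leaves 1, 2, 1, 2, 3, 4, which is the result an
overlap-safe copy should give.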
>> A while ago, I added a primitive (structural) polymorphic hash:
>> http://mlton.org/cgi-bin/viewsvn.cgi?view=rev&rev=6352
>> It would seem to suit your purposes: you can use it to hash any value,
>> including datatypes.
>
> This is very nice and I didn't know about it. Unfortunately, it's not
> enough because I need a universal hash function (one that takes a
> 'seed' with the value to hash).
For a given program, MLton.hash is a function (that is, it always returns
the same hash value for structurally equivalent inputs). So, why can't
you take the result of MLton.hash and munge it with your 'seed'? Or,
better, you can always use (fn x => MLton.hash (seed, x)), so that you
hash your seed along with the structure of interest.
> Also, one still needs to be able to
> serialize a network address out to the network in some situations. (eg
> to say: send reply message to X, not me)
Fair enough. Though, in that situation, isn't it better to go through the
Basis Library functions? A struct sockaddr blast-written to the network
might not be blast-readable by another arch/OS unless you guarantee that
the sizes, alignment, and padding are all the same.
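
To illustrate the alternative to blasting the raw struct, here is a small
C sketch that writes an IPv4 address and port in an explicit byte layout.
The six-byte wire format and the function names are assumptions made up
for this example, not anything the Basis Library or MLton defines; the
point is only that every byte position is fixed by the code, so sizes,
alignment, padding, and endianness of the host no longer matter.

```c
#include <stdint.h>

/* Hypothetical wire format: 4 address bytes followed by 2 port bytes,
 * both most-significant byte first. */
enum { WIRE_ADDR_LEN = 6 };

/* addr and port are ordinary host-order integers,
 * e.g. 0xC0A80001 for 192.168.0.1. */
static void wire_put_ipv4(uint8_t out[WIRE_ADDR_LEN],
                          uint32_t addr, uint16_t port)
{
    out[0] = (uint8_t)(addr >> 24);
    out[1] = (uint8_t)(addr >> 16);
    out[2] = (uint8_t)(addr >> 8);
    out[3] = (uint8_t)addr;
    out[4] = (uint8_t)(port >> 8);
    out[5] = (uint8_t)port;
}

/* Inverse of wire_put_ipv4: rebuilds host-order integers from bytes. */
static void wire_get_ipv4(const uint8_t in[WIRE_ADDR_LEN],
                          uint32_t *addr, uint16_t *port)
{
    *addr = ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16)
          | ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
    *port = (uint16_t)(((unsigned)in[4] << 8) | (unsigned)in[5]);
}
```

The receiver decodes the same six bytes regardless of its own struct
sockaddr layout.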