As I'm sure everyone has run into at some time or another, the PackWordX API is flawed:<br><br><blockquote style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;" class="gmail_quote">
<code><b>val</b> <a name="12254a6764f055fe_SIG:PACK_WORD.bytesPerElem:VAL:SPEC" href="http://mlton.org/basis/pack-word.html#SIG:PACK_WORD.bytesPerElem:VAL" target="_blank">bytesPerElem</a> <b>:</b> int</code><br><code>
<b>val</b> <a name="12254a6764f055fe_SIG:PACK_WORD.isBigEndian:VAL:SPEC" href="http://mlton.org/basis/pack-word.html#SIG:PACK_WORD.isBigEndian:VAL" target="_blank">isBigEndian</a> <b>:</b> bool</code><br><code>
<b>val</b> <a name="12254a6764f055fe_SIG:PACK_WORD.subVec:VAL:SPEC" href="http://mlton.org/basis/pack-word.html#SIG:PACK_WORD.subVec:VAL" target="_blank">subVec</a> <b>:</b> Word8Vector.vector <b>*</b> int <b>-></b> LargeWord.word</code><br>
<code>
<b>val</b> <a name="12254a6764f055fe_SIG:PACK_WORD.subVecX:VAL:SPEC" href="http://mlton.org/basis/pack-word.html#SIG:PACK_WORD.subVecX:VAL" target="_blank">subVecX</a> <b>:</b> Word8Vector.vector <b>*</b> int <b>-></b> LargeWord.word</code><br>
<code>
<b>val</b> <a name="12254a6764f055fe_SIG:PACK_WORD.subArr:VAL:SPEC" href="http://mlton.org/basis/pack-word.html#SIG:PACK_WORD.subArr:VAL" target="_blank">subArr</a> <b>:</b> Word8Array.array <b>*</b> int <b>-></b> LargeWord.word</code><br>
<code>
<b>val</b> <a name="12254a6764f055fe_SIG:PACK_WORD.subArrX:VAL:SPEC" href="http://mlton.org/basis/pack-word.html#SIG:PACK_WORD.subArrX:VAL" target="_blank">subArrX</a> <b>:</b> Word8Array.array <b>*</b> int <b>-></b> LargeWord.word</code><br>
<code>
<b>val</b> <a name="12254a6764f055fe_SIG:PACK_WORD.update:VAL:SPEC" href="http://mlton.org/basis/pack-word.html#SIG:PACK_WORD.update:VAL" target="_blank">update</a> <b>:</b> Word8Array.array <b>*</b> int <b>*</b> LargeWord.word</code><br>
<code>
<b>-></b> unit</code><br><code></code></blockquote><code><br></code>where instead it should read something like:<code><br><br></code><blockquote style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;" class="gmail_quote">
<code><b>type</b> <a name="12254a6764f055fe_SIG:PACK_WORD.bytesPerElem:VAL:SPEC" href="http://mlton.org/basis/pack-word.html#SIG:PACK_WORD.bytesPerElem:VAL" target="_blank">word</a></code><br><code><b>val</b> <a name="12254a6764f055fe_SIG:PACK_WORD.bytesPerElem:VAL:SPEC" href="http://mlton.org/basis/pack-word.html#SIG:PACK_WORD.bytesPerElem:VAL" target="_blank">bytesPerElem</a> <b>:</b> int</code><br>
<code>
</code><code><b>val</b> <a name="12254a6764f055fe_SIG:PACK_WORD.isBigEndian:VAL:SPEC" href="http://mlton.org/basis/pack-word.html#SIG:PACK_WORD.isBigEndian:VAL" target="_blank">isBigEndian</a> <b>:</b> bool</code><br><code>
<b>val</b> <a name="12254a6764f055fe_SIG:PACK_WORD.subVec:VAL:SPEC" href="http://mlton.org/basis/pack-word.html#SIG:PACK_WORD.subVec:VAL" target="_blank">subVec</a> <b>:</b> Word8Vector.vector <b>*</b> int <b>-></b> word</code><br>
<code>
<b></b></code><code><b>val</b> <a name="12254a6764f055fe_SIG:PACK_WORD.subArr:VAL:SPEC" href="http://mlton.org/basis/pack-word.html#SIG:PACK_WORD.subArr:VAL" target="_blank">subArr</a> <b>:</b> Word8Array.array <b>*</b> int <b>-></b> word</code><br>
<code>
<b></b></code><code><b>val</b> <a name="12254a6764f055fe_SIG:PACK_WORD.update:VAL:SPEC" href="http://mlton.org/basis/pack-word.html#SIG:PACK_WORD.update:VAL" target="_blank">update</a> <b>:</b> Word8Array.array <b>*</b> int <b>*</b> word</code><code> <b>-></b> unit</code></blockquote>
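<br>To make the proposed shape concrete, here is a minimal sketch in portable SML of a little-endian 32-bit instance. The names PACK_WORD_DIRECT and PackWord32LittleDirect are made up for illustration and are not part of the Basis Library or MLton. Note that the byte conversions inside still go through LargeWord; the sketch only shows that the interface itself does not need it, and a compiler primitive would avoid the internal conversions too.<br>

```sml
(* Hypothetical signature following the proposal above; the name
   PACK_WORD_DIRECT is an assumption made for this sketch. *)
signature PACK_WORD_DIRECT =
sig
  type word
  val bytesPerElem : int
  val isBigEndian : bool
  val subVec : Word8Vector.vector * int -> word
  val subArr : Word8Array.array * int -> word
  val update : Word8Array.array * int * word -> unit
end

(* Hypothetical little-endian instance at Word32. *)
structure PackWord32LittleDirect :> PACK_WORD_DIRECT where type word = Word32.word =
struct
  type word = Word32.word
  val bytesPerElem = 4
  val isBigEndian = false

  (* Byte <-> Word32 conversions; these still pass through LargeWord. *)
  fun w8to32 (b : Word8.word) : Word32.word = Word32.fromLarge (Word8.toLarge b)
  fun w32to8 (w : Word32.word) : Word8.word = Word8.fromLarge (Word32.toLarge w)

  (* Assemble four bytes fetched by `get`, least significant byte first. *)
  fun assemble get base =
    let
      fun byte k = Word32.<< (w8to32 (get (base + k)), Word.fromInt (8 * k))
    in
      Word32.orb (Word32.orb (byte 0, byte 1), Word32.orb (byte 2, byte 3))
    end

  (* As in PACK_WORD, index i addresses bytes i*bytesPerElem onwards. *)
  fun subVec (v, i) = assemble (fn j => Word8Vector.sub (v, j)) (i * bytesPerElem)
  fun subArr (a, i) = assemble (fn j => Word8Array.sub (a, j)) (i * bytesPerElem)

  fun update (a, i, w) =
    let
      val base = i * bytesPerElem
      (* Store byte k of w; Word8.fromLarge keeps only the low 8 bits. *)
      fun put k =
        Word8Array.update (a, base + k, w32to8 (Word32.>> (w, Word.fromInt (8 * k))))
    in
      put 0; put 1; put 2; put 3
    end
end
```

With this shape, a caller writes PackWord32LittleDirect.update (arr, i, w) with w : Word32.word directly, and no fromLarge/toLarge appears at the call site.<br>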
<div><br>In our networking code, I worked around this by using _prim "Word8Array_subWordX" when compiling with MLton. This avoids the two C calls casting in and out of a 64-bit word for every word written into the data stream. I recently ran into trouble on a 64-bit machine because SeqIndex.int is not int, and I got a PrimApp error. As a stop-gap measure, I'm open to suggestions for an Int/Word type that is guaranteed to match SeqIndex.<br>
<br>It would be nice to have 'unsafe' versions without the LargeWord baggage available somewhere, so _prim isn't needed. Armed with 'unsafe' PackWord, it would be easy to implement faster string/Word8Array copies, as discussed before.<br>
<br>I'll also note that PackWord represents yet another case where the basis library expects MLton to optimize fromLarge o toLarge to nothing. I've been getting increasingly annoyed by the costs I pay to convert between types. I really liked Vesa's suggestion of {to/from}Fixed for the INTEGER signature. Combining that with the optimization to turn<br>
x_1227: word32 = Word8Vector_subWord32 (x_1072, x_1074)<br>
x_1226: word64 = WordU32_extdToWord64 (x_1227)<br>
x_1225: word32 = WordU64_extdToWord32 (x_1226)<br>into<br>x_1225: word32 = x_1227<br>I think we would be able to achieve zero-cost conversions in almost all the cases where it is safe.<br><br>If that conversion optimization were placed before commonArg and knownCase, I think Int8.fromFixed o Int8.toFixed would become a no-op even with overflow checking:<br>
<br>x_1 = ...<br>x_2 = WordU8_sextdToWord64 x_1<br>x_3 = WordU64_sextdToWord8 x_2<br>(* from iwconv0 bounds checking: *)<br>x_4 = WordU8_sextdToWord64 x_3<br>x_5 = Word64_eq (x_2, x_4)<br>raise Overflow exception if x_5 is false<br>
<br>First comes the new optimization:<br>x_3 = x_1<br>Then comes commonArg/commonSubexp:<br>x_4 and x_3 are replaced by x_2 and x_1 respectively<br>Then comes knownCase:<br>Word64_eq (x_2, x_2) is never false -> the exception is never raised<br>
<br>Am I correct in this assessment? If so, that's a pretty serious speed-up: 5 C calls and a potential branch turned into a no-op. Compared to 4 conversions in and out of an IntInf, things look even better!<br><br></div>
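<br>For what it's worth, the value-level facts the argument above relies on can be checked exhaustively in plain SML. This is only a sanity check under the assumption that Word8.toLargeX models the sign extension and Word8.fromLarge models the truncation; it says nothing about how MLton's passes are actually ordered or implemented. The names roundTrip and allBytes are made up for this sketch.<br>

```sml
(* Model of the sequence above:
   x_2 = sext x_1; x_3 = trunc x_2; x_4 = sext x_3; x_5 = (x_2 = x_4). *)
fun roundTrip (x1 : Word8.word) =
  let
    val x2 = Word8.toLargeX x1      (* sign-extend to LargeWord *)
    val x3 = Word8.fromLarge x2     (* truncate back to 8 bits *)
    val x4 = Word8.toLargeX x3      (* re-extend for the bounds check *)
    val x5 = (x2 = x4)              (* the overflow test *)
  in
    (x3 = x1, x5)
  end

(* True iff p holds for every one of the 256 byte values. *)
fun allBytes p =
  let
    fun loop i = i > 255 orelse (p (Word8.fromInt i) andalso loop (i + 1))
  in
    loop 0
  end

(* Replacing x_3 by x_1 preserves the value for every input. *)
val rewriteIsSound = allBytes (fn b => #1 (roundTrip b))

(* The overflow comparison holds for every input, so the raise is dead. *)
val checkNeverFails = allBytes (fn b => #2 (roundTrip b))
```

If both values come out true, the rewrite x_3 = x_1 is value-preserving for Int8, and the comparison feeding the Overflow branch can never fail.<br>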