Hmm, it failed with -codegen native as well as with -codegen amd64. I&#39;ll investigate further.<br><br><div><span class="gmail_quote">On 6/21/07, <b class="gmail_sendername">Jesper Louis Andersen</b> &lt;<a href="mailto:jesper.louis.andersen@gmail.com">

jesper.louis.andersen@gmail.com</a>&gt; wrote:</span><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">I have a 32-bit FreeBSD compile ready. Benchmarks will trickle in tomorrow, once I

<br>get it to crunch while I am at work (it takes a fair amount of time to finish, and I don&#39;t<br>want to mess too much with the laptop while it processes the benchmarks).

<br><br>There are also a few tweaks needed to run on 64-bit FreeBSD. I hope to be able to<br>look into them around the 1st of July.<div><span class="e" id="q_1134b55bcc807728_1"><br><br><div><span class="gmail_quote">On 6/20/07, 

<b class="gmail_sendername">Matthew Fluet

</b> &lt;<a href="mailto:fluet@tti-c.org" target="_blank" onclick="return top.js.OpenExtLink(window,event,this)">fluet@tti-c.org</a>&gt; wrote:</span><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

I&#39;ve merged the x86_64 branch into trunk.&nbsp;&nbsp;Since the previous

<br>announcement of the experimental release, there were only two minor bugs<br>reported:<br>&nbsp;&nbsp;1) Bug with -align 8 on x86_64<br>&nbsp;&nbsp;2) Inconsistent behavior with -const &#39;MLton.detectOverflow false&#39;<br>These have both been fixed, and I&#39;m pretty happy with the state of the

<br>x86_64 port.<br><br>I ran the benchmark suite to compare the last public release to the<br>current trunk.&nbsp;&nbsp;It is a bit of an apples-to-oranges comparison, since I<br>ran the benchmarks on an AMD Opteron (64-bit) system.&nbsp;&nbsp;So, the 20051205

<br>compiler (and its resulting executables) are running in 32-bit mode,<br>while the trunk compiler (and its resulting executables) are running in<br>64-bit mode.<br><br>[BTW, it would be nice if someone could run a corresponding benchmark

<br>suite on a 32-bit system, for a more apples-to-apples comparison.]<br><br>You can see all of the results at:<br><a href="http://mlton.org/cgi-bin/viewsvn.cgi/*checkout*/mlton/trunk/doc/x86_64-port-notes/bench-20070619.txt?rev=5659" target="_blank" onclick="return top.js.OpenExtLink(window,event,this)">

http://mlton.org/cgi-bin/viewsvn.cgi/*checkout*/mlton/trunk/doc/x86_64-port-notes/bench-20070619.txt?rev=5659</a><br><br>Some of the highlights:<br><br>* Benchmarks were run on a uni-core, dual-processor AMD Opteron 2.0GHz,<br>8GB memory, Fedora Core 6 machine (with gcc version 4.1.1 and Linux<br>version 2.6.20 (x86_64)).<br><br>* compile time and code size are up across the board on trunk vs<br>20051205.&nbsp;&nbsp;I suspect that part of the code size increase can be

<br>attributed to the comparison of 32-bit executables to 64-bit<br>executables.&nbsp;&nbsp;Any 64-bit operation requires an additional 8-bit<br>instruction prefix (as do 32-bit ops that touch the extended register<br>set).&nbsp;&nbsp;Compile time is probably partly explained by the bigger Basis

<br>Library implementation (increasing elaboration time and carrying more<br>code through early optimizations), and partly by the fact that the trunk<br>compiler is executing a little slower than the 20051205 compiler.<br>

<br>* recent versions of gcc are doing fairly well with the C code.&nbsp;&nbsp;(Note<br>that using -codegen c with 20051205 uses the version of gcc on the host<br>machine.)&nbsp;&nbsp;Indeed, the flat-array.sml benchmark needs to be revised, as

<br>gcc recognizes that the inner loop is pure (Overflow exceptions are<br>handled within the loop) and unused.&nbsp;&nbsp;The SSA{,2} optimizer should also<br>discover that the loop may be optimized, but that is another issue.<br>

GCC also does fairly well on the checksum benchmark with 20051205,<br>though it does horribly on the checksum benchmark with trunk.<br>I suspect that the latter behavior is due to the fact that on x86_64,<br>sequences (arrays/vectors) are indexed by 64-bit integers in the

<br>primitive operations (sub, update, etc.), but indexed by 32-bit integers<br>in the user code (Array.sub, Array.update, etc., since Int.int<br>corresponds to Int32.int).&nbsp;&nbsp;Hence, there are quite a few 64/32<br>conversions going on.<br><br>* I note that, with both native codegens and C codegens and with both<br>20051205 and trunk, -align 8 often has a positive impact on<br>runtime, and rarely has a significant negative impact.&nbsp;&nbsp;This might be

<br>due to the Opteron memory system.&nbsp;&nbsp;Aligned reads probably help most on<br>Real64 intensive benchmarks.<br><br>* The amd64 codegen is doing alright as compared to the x86 codegen.&nbsp;&nbsp;I<br>see at most a factor of 2 slowdown, and a few speedups.&nbsp;&nbsp;Again, I&#39;m not

<br>sure what real conclusions can be drawn.&nbsp;&nbsp;Some slowdowns are going to be<br>due to the changes to the runtime and Basis Library since 20051205; to<br>isolate those, I need a comparison of 20051205 to trunk on a 32-bit

<br>system.&nbsp;&nbsp;Some slowdowns are probably going to be due to the sequence<br>indexing discussed above.<br><br></blockquote></div><br>

</span></div></blockquote></div><br>