[MLton-user] more optimization questions

Sun, 20 Nov 2005 17:33:24 -0800

On Nov 20, 2005, at 3:29 PM, Matthew Fluet wrote:
> It would be interesting to see the effect of -detect-overflow false.

A modest improvement:

real    0m12.700s
user    0m12.577s
sys     0m0.047s

> Which, by the way, is a good thing.  I know everyone means well,  
> but it isn't (always) a meaningful comparison to transliterate a C  
> program into SML and expect the same performance out of MLton as  
> out of GCC.  I'd like to see someone transliterate Henry's
> count-graphs benchmark, which makes heavy use of higher-order
> functions and exceptions, into C and report back on mlton's vs gcc's
> performance.

Actually my program started in ML, and I translated it to C :-)

The metric driving my interest in this is not what C does;
it's what MFLOPS I get.  However, C gives better MFLOPS, so I
use it as a benchmark.  I think this is an interesting problem in
that it's very easy to understand how to write these FD-like codes
very efficiently, so what keeps a higher-order language from doing
the same?

> Another thread along these lines starts here:
>   http://mlton.org/pipermail/mlton/2005-March/026874.html
>

Very useful link, thank you.

> I seem to recall that at one point in time, we had inline assembly  
> for overflow checking arithmetic in the (support code for the)
> C-codegen.  When we had the native x86-codegen, we simplified that
> away, but it might be worthwhile to see what inline PowerPC  
> assembly for overflow checking arithmetic gives you.

Doesn't setting overflow detection to false take care of all that?
BTW, Warning: -detect-overflow is deprecated.  Use -const  
'MLton.detectOverflow <value>'
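
For reference, the kind of per-operation test that overflow detection implies can be sketched in portable C.  This is only an illustration: checked_add is a hypothetical helper, not the code MLton's C codegen emits, and the inline assembly mentioned above would presumably read the hardware overflow flag instead of comparing against the limits.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper (not MLton's actual support code).
 * Adds a and b.  Returns 1 if the sum would overflow an int,
 * leaving *out untouched; otherwise stores the sum in *out
 * and returns 0. */
int checked_add(int a, int b, int *out)
{
    assert(out != NULL);

    /* Test against the limits before adding, because signed
     * overflow itself is undefined behaviour in C. */
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b)) {
        return 1;
    }

    *out = a + b;
    return 0;
}
```

Every checked addition pays for the extra comparisons and a branch, which is the work that turning overflow detection off removes.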

>> Also I'm just plain curious as to what is going on.  It's not  
>> obvious to me that any of the optimizations being discussed are  
>> worth a factor of 3.5 in performance, are they?
>
> It's hard to say.  There is an additional issue that, to GCC, all  
> the C-code that MLton produces looks as though it is doing a lot of  
> heap reads and writes, since MLton puts the ML stack on the heap.   
> This means that GCC is probably being a bit conservative in its
> alias analysis, and won't be able to do any of the loop  
> optimizations for us.

Well - that would certainly make it difficult to extract better  
performance out of ML without introducing native code.

I was thinking that one interesting project which might make use of  
ML's strengths is a problem-specific language, something along the
lines of FFTW.  Write a program which generates NATIVE CODE to run FD  
codes.  A domain-specific compiler as it were.  I think this is a  
very reasonable thing to try and do given the very narrow constraints  
on what the problem solution looks like, i.e. loops with simple array  
updates based on simple calculations.
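
To make "loops with simple array updates" concrete, here is a minimal sketch of the kind of kernel such a generator would have to emit.  It assumes "FD" means finite difference and uses a 1D explicit diffusion step purely as an example; the name fd_step, its parameters and the choice of stencil are illustrative assumptions, not taken from the program discussed in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative kernel (assumed, not the program from this thread).
 * One explicit time step of a 1D three-point stencil:
 *   u_new[i] = u[i] + r * (u[i-1] - 2*u[i] + u[i+1])
 * The two boundary values are copied through unchanged.
 * u and u_new must not overlap. */
void fd_step(const double *u, double *u_new, size_t n, double r)
{
    size_t i;

    assert(n >= 2);

    /* Fixed boundary values. */
    u_new[0] = u[0];
    u_new[n - 1] = u[n - 1];

    /* Interior points: each update reads three neighbours
     * and writes one element. */
    for (i = 1; i + 1 < n; ++i) {
        u_new[i] = u[i] + r * (u[i - 1] - 2.0 * u[i] + u[i + 1]);
    }
}
```

The loop body is a handful of floating-point operations on adjacent array elements, which is why the achievable MFLOPS depends so heavily on how well the compiler handles the loop itself.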

Of course, you might then argue that it makes more sense to simply  
have ML generate the C-code directly and that gets you to "just use  
C".  So maybe the right answer is to use C and use ML as the glue for  
all of the higher level tasks which need to be done.

However, this ignores a very important issue: programming in C is
just so darn annoying!

Brian