[MLton-user] more optimization questions
brian denheyer
briand@aracnet.com
Sun, 20 Nov 2005 17:33:24 -0800
On Nov 20, 2005, at 3:29 PM, Matthew Fluet wrote:
>
> It would be interesting to see the effect of -detect-overflow false.
>
A modest improvement:
real 0m12.700s
user 0m12.577s
sys 0m0.047s
> Which, by the way, is a good thing. I know everyone means well,
> but it isn't (always) a meaningful comparison to transliterate a C
> program into SML and expect the same performance out of MLton as
> out of GCC. I'd like to see someone transliterate Henry's count-
> graphs benchmark, which makes heavy use of higher order function
> and exceptions, into C and report back on mlton's vs gcc's
> performance.
Actually my program started in ML, and I translated it to C :-)
The metric driving my interest in this is not what C does; it's
what I get in MFLOPS. However, C gives better MFLOPS, so I use it
as a benchmark. I think this is an interesting problem in that it's
very easy to understand how to write these FD-like codes very
efficiently, so what keeps a higher-order language from doing the
same?
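
For concreteness, the kind of kernel in question can be sketched as below. This is only an illustration, assuming "FD" means finite-difference and using a 1-D explicit heat-equation step. The names fd_step and fd_mflops, and the 5-flops-per-point count, are assumptions made for this sketch and are not taken from the program being benchmarked.

```c
#include <assert.h>
#include <stddef.h>

/* One explicit time step of a 1-D heat equation on n grid points.
 * Interior points get the 3-point stencil; the two boundary values
 * are copied through unchanged. alpha stands for dt*k/dx^2.
 * (Hypothetical kernel, not the poster's actual code.) */
void fd_step(const double *u, double *u_new, size_t n, double alpha)
{
    assert(n >= 3);
    u_new[0] = u[0];
    for (size_t i = 1; i + 1 < n; i++) {
        u_new[i] = u[i] + alpha * (u[i - 1] - 2.0 * u[i] + u[i + 1]);
    }
    u_new[n - 1] = u[n - 1];
}

/* MFLOPS for `steps` sweeps over n points taking `seconds`,
 * counting 5 floating-point operations per interior point. */
double fd_mflops(size_t n, size_t steps, double seconds)
{
    double flops = 5.0 * (double)(n - 2) * (double)steps;
    return flops / (seconds * 1.0e6);
}
```

The loop body is a handful of loads, one store and five flops, with no allocation or indirection, which is why a C compiler has an easy time with it.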
> Another thread along these lines starts here:
> http://mlton.org/pipermail/mlton/2005-March/026874.html
>
Very useful link, thank you.
> I seem to recall that at one point in time, we had inline assembly
> for overflow checking arithmetic in the (support code for the) C-
> codegen. When we had the native x86-codegen, we simplified that
> away, but it might be worthwhile to see what inline PowerPC
> assembly for overflow checking arithmetic gives you.
Doesn't setting overflow detection to false take care of all that?
BTW:
  Warning: -detect-overflow is deprecated. Use -const
  'MLton.detectOverflow <value>'
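
To make the distinction concrete, here is a rough C sketch of what overflow-checked versus unchecked integer addition look like. It uses the GCC/Clang builtin __builtin_add_overflow; it is not MLton's actual runtime or codegen support code, and the function names add_wrapping and add_checked are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Unchecked add: silently wraps on overflow. The sum is computed in
 * unsigned arithmetic so the wraparound is well defined; converting
 * back to int32_t is implementation-defined, and GCC/Clang wrap. */
int32_t add_wrapping(int32_t a, int32_t b)
{
    return (int32_t)((uint32_t)a + (uint32_t)b);
}

/* Checked add: returns false when the true sum does not fit in
 * int32_t. (*sum then holds the wrapped result.) The builtin
 * typically compiles to an add plus a branch on the overflow flag. */
bool add_checked(int32_t a, int32_t b, int32_t *sum)
{
    return !__builtin_add_overflow(a, b, sum);
}
```

With detection turned off, only something like the first form is needed, so the extra test-and-branch on every arithmetic operation goes away.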
>> Also I'm just plain curious as to what is going on. It's not
>> obvious to me that any of the optimizations being discussed are
>> worth a factor of 3.5 in performance, are they ?
>
> It's hard to say. There is an additional issue that, to GCC, all
> the C-code that MLton produces looks as though it is doing a lot of
> heap reads and writes, since MLton puts the ML stack on the heap.
> This means that GCC is probably being a bit conservative in its
> alias analysis, and won't be able to do any of the loop
> optimizations for us.
Well - that would certainly make it difficult to extract better
performance out of ML without introducing native code.
I was thinking that one interesting project which might make use of
ML's strengths is a problem-specific language, something along the
lines of FFTW. Write a program which generates NATIVE CODE to run FD
codes. A domain-specific compiler, as it were. I think this is a
very reasonable thing to try, given the very narrow constraints on
what the problem solution looks like, i.e. loops with simple array
updates based on simple calculations.
Of course, you might then argue that it makes more sense to simply
have ML generate the C code directly, and that gets you to "just use
C". So maybe the right answer is to use C for the kernels and use ML
as the glue for all of the higher-level tasks which need to be done.
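
As a toy illustration of the "generate the C code" idea, the sketch below emits the source of a specialized 3-point stencil loop with its coefficients baked in as literals. The generator is written in C here only so that it compiles with the same toolchain as the kernel it emits; the idea above is that ML would play this role. The names emit_stencil3 and sweep are hypothetical.

```c
#include <stdio.h>
#include <stddef.h>

/* Write C source for one sweep of a symmetric 3-point stencil into
 * buf. c_side and c_mid appear in the generated text as literals, so
 * the C compiler sees constants rather than loads.
 * Returns the snprintf result: the length the full text needs
 * (excluding the terminating NUL), or a negative value on error. */
int emit_stencil3(char *buf, size_t cap, double c_side, double c_mid)
{
    return snprintf(buf, cap,
        "void sweep(const double *u, double *v, long n)\n"
        "{\n"
        "    for (long i = 1; i < n - 1; i++)\n"
        "        v[i] = %.17g * (u[i - 1] + u[i + 1]) + %.17g * u[i];\n"
        "}\n",
        c_side, c_mid);
}
```

A real generator would also choose loop order, unrolling and blocking per problem, in the spirit of FFTW; this only shows the string-emitting core.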
However, this ignores a very important issue: programming in C is
just so darn annoying!
Brian