[MLton] Question on profile.fun

Matthew Fluet fluet@cs.cornell.edu
Sun, 5 Jun 2005 23:35:39 -0400 (EDT)


> So, this seems to suggest that the slowdown due to missed SSA 
> optimizations is fairly low, though it is the cause of the insane behavior 
> of wc-scanStream.  Knowing that, it is probably worth adding to a TODO to 
> investigate.

A quick investigation turned up this surprising result.  I wrote a little
SSA pass to drop profiling expressions from an SSA IL program (and set
Control.profile to ProfileNone) and inserted it after every pass in the 
SSA optimization sequence.  I measured the running time of wc-scanStream 
when compiled with flags like
  -profile drop  -drop-pass "dropProfile[A-G]"
With these flags, dropProfileA through dropProfileG are skipped, so the
program carries profiling information until the dropProfileH pass, after
which there is no profiling information in the program.
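
To make the flag example concrete, here is a small Python sketch (not part
of MLton) that works out which dropProfile instance is the first to run for
a given -drop-pass regexp.  It assumes the regexp has to match the whole
pass name, and that the passes are named dropProfileA .. dropProfileZ
followed by dropProfileAA, as in the table below.

```python
import re
import string

# Names of the inserted passes, in the order they appear in the SSA
# optimization sequence: dropProfileA .. dropProfileZ, then dropProfileAA.
names = ["dropProfile" + c for c in string.ascii_uppercase] + ["dropProfileAA"]

# The regexp from the example flags above.
pattern = re.compile(r"dropProfile[A-G]")

# Assumption: a pass is dropped only if the regexp matches its whole name.
dropped = [n for n in names if pattern.fullmatch(n)]

# The first pass that is not dropped is the one that strips the profiling
# information from the program.
first_to_run = next(n for n in names if not pattern.fullmatch(n))
print(first_to_run)
```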

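Here's what I observed.  The time listed to the left of each dropProfile*
is the running time of wc-scanStream when that instance of dropProfile is
the first to run.  The pass named on the line below is the next SSA pass
in the sequence.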

     1.53 dropProfileA
          removeUnused1
     1.54 dropProfileB
          leafInline
     1.52 dropProfileC
          contify1
     1.66 dropProfileD
          localFlatten1
     1.52 dropProfileE
          constantPropagation
     1.57 dropProfileF
          useless
     1.51 dropProfileG
          removeUnused2
     1.54 dropProfileH
          simplifyTypes
     1.52 dropProfileI
          polyEqual
     1.55 dropProfileJ
          contify2
     1.52 dropProfileK
          inline
     1.52 dropProfileL
          localFlatten2
     1.51 dropProfileM
          removeUnused3
     1.52 dropProfileN
          contify3
     1.87 dropProfileO
          introduceLoops
     1.99 dropProfileP
          loopInvariant
     1.90 dropProfileQ
          localRef
     1.98 dropProfileR
          flatten
    12.89 dropProfileS
          localFlatten3
    12.90 dropProfileT
          commonArg
    12.90 dropProfileU
          commonSubexp
    12.90 dropProfileV
          commonBlock
    12.91 dropProfileW
          redundantTests
    12.91 dropProfileX
          redundant
    12.91 dropProfileY
          knownCase
    12.88 dropProfileZ
          removeUnused4
    12.90 dropProfileAA

So, there is an insane shift (from about 2 to about 12.9) once profiling
information is carried through flatten, and possibly a minor shift (from
about 1.5 to about 1.9) once it is carried through contify3.  I don't know
if this actually lays the blame squarely at the feet of Flatten.flatten
(or possibly Shrink.shrinkFunction).  Flatten.flatten doesn't appear to be
sensitive to the presence of profiling statements in the program.  But
there is definitely something going on there.
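
For anyone who wants to read the shifts off the table mechanically, here is
a rough Python sketch.  The timings are copied from the table above; the
0.2 threshold for what counts as a shift is my own arbitrary choice, picked
only to sit above the run-to-run noise visible in the early rows.

```python
# (running time, first dropProfile instance to run, SSA pass that follows it)
rows = [
    (1.53, "A", "removeUnused1"), (1.54, "B", "leafInline"),
    (1.52, "C", "contify1"), (1.66, "D", "localFlatten1"),
    (1.52, "E", "constantPropagation"), (1.57, "F", "useless"),
    (1.51, "G", "removeUnused2"), (1.54, "H", "simplifyTypes"),
    (1.52, "I", "polyEqual"), (1.55, "J", "contify2"),
    (1.52, "K", "inline"), (1.52, "L", "localFlatten2"),
    (1.51, "M", "removeUnused3"), (1.52, "N", "contify3"),
    (1.87, "O", "introduceLoops"), (1.99, "P", "loopInvariant"),
    (1.90, "Q", "localRef"), (1.98, "R", "flatten"),
    (12.89, "S", "localFlatten3"), (12.90, "T", "commonArg"),
    (12.90, "U", "commonSubexp"), (12.90, "V", "commonBlock"),
    (12.91, "W", "redundantTests"), (12.91, "X", "redundant"),
    (12.91, "Y", "knownCase"), (12.88, "Z", "removeUnused4"),
    (12.90, "AA", None),
]

THRESHOLD = 0.2  # assumption: anything smaller is treated as noise

# Moving the first dropProfile from one row to the next means one more SSA
# pass sees the profiling information, so the change in running time is
# attributed to the pass sitting between the two rows.
shifts = []
for (t0, _, ssa_pass), (t1, _, _) in zip(rows, rows[1:]):
    delta = t1 - t0
    if abs(delta) > THRESHOLD:
        shifts.append((ssa_pass, round(delta, 2)))

for ssa_pass, delta in shifts:
    print(f"{ssa_pass}: {delta:+.2f}")
```

Note that this only says which pass the profiling information has to
survive for the slowdown to appear; it does not show that the pass itself
is at fault.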