[MLton] Question on profile.fun

Matthew Fluet fluet@cs.cornell.edu
Sun, 5 Jun 2005 14:09:05 -0400 (EDT)


> So, I admit that the bound on code-insertion for time-profiling isn't all 
> that good.  Again, we might argue that this is an upper-bound, since the 
> inserted code is more expensive than a single move.  I'll see about 
> queueing up that experiment.

MLton0 -- mlton -profile no
MLton1 -- mlton -profile drop 
MLton2 -- mlton -profile drop -profile-dummy mov
MLton3 -- mlton -profile drop -profile-dummy inc

        # benchmarks    MLton3/MLton2 run time ratio <=
        ------------    -------------------------------
        17              1.0
        35              1.1
        38              1.2

        outliers  1.22 life,  1.30 imp-for,  1.31 tailfib,  1.35 peek

So, for the most part, a mov vs. an inc is a wash.  The exception is 
a tight loop that entails lots of updates to the gcState, in which 
case the mov does do better.  All in all, no surprises, and yet 
further evidence that code insertion is too intrusive for time profiling.
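
For anyone who wants to redo the tally above, here is a minimal Python 
sketch.  It assumes the ratio is the MLton3 run time divided by the MLton2 
run time, and it carries only a hand-picked subset of rows copied from the 
run time table (the four outliers plus three others), so its counts are for 
that subset and not the full 17/35/38.  The function names are made up for 
illustration and are not part of any MLton tooling.

```python
# (MLton2, MLton3) run times, copied from the "run time" table.
# Only a subset of benchmarks is listed here to keep the sketch short.
run_times = {
    "life": (20.44, 24.95),
    "imp-for": (85.03, 110.53),
    "tailfib": (50.87, 66.67),
    "peek": (53.36, 71.85),
    "checksum": (196.39, 186.62),
    "merge": (87.99, 87.91),
    "fib": (100.12, 101.97),
}


def ratios(times):
    """Map each benchmark to its MLton3/MLton2 (inc over mov) run time ratio."""
    return {name: inc / mov for name, (mov, inc) in times.items()}


def cumulative_counts(rs, thresholds=(1.0, 1.1, 1.2)):
    """For each threshold, count the benchmarks whose ratio is <= it."""
    return [(t, sum(1 for r in rs.values() if r <= t)) for t in thresholds]


def outliers(rs, limit=1.2):
    """Benchmarks whose ratio exceeds the limit, sorted by rounded ratio."""
    return sorted((round(r, 2), name) for name, r in rs.items() if r > limit)


rs = ratios(run_times)
for threshold, count in cumulative_counts(rs):
    print(f"{count:>3}  <= {threshold}")
print("outliers:", outliers(rs))
```

Feeding all 42 rows into run_times should give the full distribution, 
though whether the buckets were originally computed on rounded or unrounded 
ratios is a guess on my part.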


MLton0 -- mlton -profile no
MLton1 -- mlton -profile drop 
MLton2 -- mlton -profile drop -profile-dummy mov
MLton3 -- mlton -profile drop -profile-dummy inc
MLton4 -- mlton -profile label 
MLton5 -- mlton -profile label -profile-dummy mov
MLton6 -- mlton -profile label -profile-dummy inc
MLton7 -- mlton -profile time 
MLton8 -- mlton -profile time -profile-dummy mov
MLton9 -- mlton -profile time -profile-dummy inc
run time ratio (relative to MLton0)
benchmark         MLton0 MLton1 MLton2 MLton3 MLton4 MLton5 MLton6 MLton7 MLton8 MLton9
barnes-hut          1.00   1.06   1.10   1.09   1.07   1.10   1.06   1.03   1.02   1.04
boyer               1.00   1.09   1.12   1.10   1.09   1.16   1.14   1.00   1.11   1.12
checksum            1.00   1.70   1.84   1.75   1.68   1.80   1.75   1.68   1.84   1.75
count-graphs        1.00   1.03   1.19   1.15   1.09   1.16   1.18   1.08   1.17   1.18
DLXSimulator        1.00   1.00   1.03   1.04   0.99   1.03   0.90   0.96   0.98   1.05
fft                 1.00   0.99   1.00   1.04   1.03   1.06   1.04   1.08   1.11   1.11
fib                 1.00   1.36   1.41   1.44   1.41   1.49   1.50   1.39   1.59   1.51
flat-array          1.00   1.00   1.06   1.06   1.06   0.97   1.09   0.86   0.89   0.91
hamlet              1.00   1.06   1.11   1.12   1.11   1.14   1.15   1.12   1.16   1.17
imp-for             1.00   1.00   1.81   2.35   1.00   1.46   2.32   0.99   1.45   2.32
knuth-bendix        1.00   1.20   1.34   1.41   1.25   1.27   1.34   1.25   1.28   1.34
lexgen              1.00   1.05   1.10   1.11   1.08   1.10   1.15   1.06   1.08   1.13
life                1.00   1.15   1.41   1.72   1.31   1.60   1.79   1.35   1.64   1.83
logic               1.00   1.02   1.08   1.06   1.04   1.10   1.10   1.05   1.11   1.11
mandelbrot          1.00   1.04   1.09   1.12   1.04   1.15   1.11   0.68   0.71   0.77
matrix-multiply     1.00   0.99   1.11   1.13   0.98   1.10   1.12   0.89   1.18   1.25
md5                 1.00   1.40   1.57   1.62   1.45   1.67   1.83   1.45   1.67   1.83
merge               1.00   1.00   1.03   1.03   1.00   1.03   1.03   1.01   1.04   1.04
mlyacc              1.00   1.03   1.05   1.06   1.04   1.08   1.08   1.05   1.08   1.09
model-elimination   1.00   1.05   1.11   1.08   1.07   1.12   1.11   1.08   1.13   1.12
mpuz                1.00   1.00   1.24   1.33   1.02   1.26   1.33   1.02   1.26   1.34
nucleic             1.00   0.99   1.01   1.01   0.99   1.01   1.02   1.08   1.02   1.02
output1             1.00   0.97   1.06   1.18   0.97   1.06   1.18   0.97   1.06   1.18
peek                1.00   1.25   1.50   2.02   1.26   1.50   2.00   1.25   1.50   2.00
psdes-random        1.00   1.10   1.38   1.60   1.10   1.36   1.59   1.10   1.41   1.63
ratio-regions       1.00   1.13   1.24   1.26   1.16   1.27   1.31   1.17   1.29   1.32
ray                 1.00   1.07   1.14   1.09   1.07   1.10   1.12   1.05   1.08   1.12
raytrace            1.00   1.03   1.14   1.10   1.05   1.17   1.12   1.06   1.17   1.13
simple              1.00   1.00   1.11   1.13   1.02   1.12   1.15   0.94   1.05   1.07
smith-normal-form   1.00   1.00   1.00   1.00   1.00   1.00   1.00   1.01   1.01   1.01
tailfib             1.00   0.96   1.16   1.52   0.96   1.16   1.52   0.96   1.16   1.52
tak                 1.00   1.36   1.32   1.37   1.44   1.39   1.47   1.43   1.39   1.46
tensor              1.00   0.84   1.25   1.49   0.84   1.25   1.49   0.84   1.25   1.49
tsp                 1.00   1.01   1.03   1.04   1.03   1.03   1.04   1.03   1.03   1.04
tyan                1.00   1.07   1.11   1.11   1.08   1.12   1.12   1.09   1.12   1.12
vector-concat       1.00   1.01   1.01   1.01   1.01   1.01   1.01   1.02   1.03   1.11
vector-rev          1.00   0.66   0.65   0.64   0.61   0.67   0.84   0.69   0.64   0.65
vliw                1.00   1.09   1.13   1.08   1.24   1.19   1.14   1.10   1.21   1.16
wc-input1           1.00   1.24   1.28   1.38   1.20   1.21   1.42   1.18   1.19   1.41
wc-scanStream       1.00   8.45   8.35   8.44   8.16   7.26   7.29   7.14   7.23   7.30
zebra               1.00   1.05   1.50   1.64   1.29   1.50   1.58   1.27   1.52   1.62
zern                1.00   1.02   1.52   1.24   1.10   1.36   1.60   1.10   1.20   1.23
size
benchmark            MLton0    MLton1    MLton2    MLton3    MLton4    MLton5    MLton6    MLton7    MLton8    MLton9
barnes-hut           99,700   120,058   125,594   122,938   145,314   150,690   148,194   149,084   154,368   151,976
boyer               135,375   162,147   177,267   171,379   206,323   221,539   215,699   213,565   228,749   222,925
checksum             50,095    52,627    52,979    52,835    55,755    56,107    55,979    62,725    63,061    62,949
count-graphs         63,135    76,263    82,239    78,911    93,783    99,719    96,423   102,129   107,777   104,657
DLXSimulator        126,067   165,179   179,371   172,091   216,299   229,819   222,707   221,613   235,117   228,021
fft                  61,358    67,718    70,662    68,934    75,982    78,878    77,150    84,232    87,176    85,384
fib                  44,691    47,047    47,399    47,207    50,143    50,495    50,351    57,129    57,481    57,337
flat-array           44,715    46,999    47,287    47,159    50,143    50,415    50,303    57,113    57,369    57,273
hamlet            1,246,854 1,913,294 2,204,158 2,061,582 2,699,934 2,996,302 2,858,062 2,704,854 3,001,174 2,862,966
imp-for              44,547    47,711    48,767    48,159    52,551    53,623    53,031    59,457    60,545    59,937
knuth-bendix        105,907   133,055   142,551   137,823   165,595   175,003   170,571   172,517   181,861   177,637
lexgen              199,332   270,180   296,204   283,412   358,500   384,420   372,220   361,020   386,956   374,724
life                 62,059    71,295    74,711    72,823    85,407    88,799    86,879    91,801    95,225    93,305
logic               103,567   133,503   146,847   138,303   172,679   186,311   177,751   179,857   193,441   184,913
mandelbrot           44,643    47,055    47,471    47,295    50,215    50,599    50,439    57,153    57,537    57,393
matrix-multiply      46,294    49,750    51,062    50,166    54,414    55,742    54,846    61,200    62,496    61,632
md5                  74,531    84,991    87,775    85,647    97,431   100,215    98,151   103,953   106,593   104,721
merge                46,271    49,279    49,855    49,599    53,255    53,815    53,575    60,161    60,721    60,497
mlyacc              501,140   696,352   776,376   733,048   933,496 1,013,704   971,240   935,984 1,016,192   973,728
model-elimination   631,901   884,693   987,165   933,909 1,236,233 1,338,401 1,286,281 1,241,211 1,343,459 1,291,339
mpuz                 47,307    53,543    57,159    54,695    60,927    64,495    62,015    69,345    72,081    70,289
nucleic             196,246   208,546   211,250   209,810   221,650   224,370   223,026   228,420   231,124   229,812
output1              77,373    84,953    86,001    85,561    97,689    98,769    98,377    99,465   100,593   100,169
peek                 73,483    81,495    82,607    82,231    92,551    93,695    93,335    97,649    98,665    98,321
psdes-random         45,355    48,407    49,047    48,727    52,255    52,895    52,591    59,177    59,817    59,513
ratio-regions        70,387   102,063   116,223   108,191   126,567   140,599   132,791   135,497   149,577   141,769
ray                 178,284   226,868   237,900   231,620   291,556   302,876   296,644   298,444   309,796   303,532
raytrace            260,497   326,209   361,697   342,833   440,625   476,065   457,457   441,497   477,001   458,473
simple              219,103   302,895   339,863   317,871   379,215   416,335   394,463   383,857   421,137   399,249
smith-normal-form   178,867   196,427   204,379   199,259   220,459   228,411   223,275   225,973   233,957   228,837
tailfib              44,387    46,679    47,015    46,855    49,727    50,063    49,919    56,681    57,001    56,873
tak                  44,771    46,943    47,215    47,071    49,959    50,215    50,119    56,993    57,249    57,153
tensor               94,850   117,762   131,282   123,322   156,514   169,314   161,466   161,940   174,740   166,900
tsp                  79,059    90,295    94,231    91,895   105,999   109,967   107,679   112,377   116,233   114,121
tyan                132,123   175,379   193,723   182,851   232,427   250,931   240,331   238,141   256,637   246,101
vector-concat        45,971    48,375    48,567    48,503    52,047    52,239    52,191    59,001    59,177    59,145
vector-rev           45,199    47,863    48,199    48,039    51,311    51,647    51,503    58,313    58,633    58,505
vliw                387,187   668,207   744,447   706,111   916,471   993,567   955,039   919,071   996,151   957,623
wc-input1            99,071   115,083   116,979   116,139   138,827   140,659   140,027   140,963   142,795   142,163
wc-scanStream       106,215   123,483   125,523   124,619   151,215   153,175   152,495   153,415   155,351   154,695
zebra               121,515   157,363   185,627   174,371   264,243   289,723   277,699   270,589   295,941   284,125
zern                 85,796    94,308    97,908    95,460   106,444   110,060   107,724   115,222   118,918   116,534
compile time
benchmark         MLton0 MLton1 MLton2 MLton3 MLton4 MLton5 MLton6 MLton7 MLton8 MLton9
barnes-hut          9.90  10.10  11.46  11.44  11.59  11.37  10.38  11.63  11.87  11.19
boyer              10.58  11.22  12.16  12.23  12.05  11.50  12.40  12.11  11.48  11.86
checksum            7.55   7.68   7.31   7.71   7.72   7.64   7.10   7.36   7.93   6.75
count-graphs        7.81   8.85   7.90   8.49   8.86   9.17   9.25   9.34   8.18   9.48
DLXSimulator       11.94  12.59  12.02  13.10  12.87  13.94  12.29  12.01  12.30  12.64
fft                 8.15   7.29   7.26   7.34   7.49   7.52   7.75   7.83   7.88   7.95
fib                 6.79   7.20   7.28   7.12   7.49   7.87   6.86   7.42   7.60   7.87
flat-array          7.04   7.14   6.98   7.09   6.85   7.43   7.03   6.59   6.62   6.70
hamlet             62.42  75.16  77.19  79.42  86.85  89.46  91.90  87.08  89.36  92.12
imp-for             6.26   6.41   6.40   6.59   7.78   7.79   6.50   6.66   6.65   6.70
knuth-bendix        8.14   8.85  10.88  11.07  11.29   9.33   9.34   9.41   9.45   9.47
lexgen             11.71  13.20  13.44  13.56  14.32  14.48  14.69  14.23  14.50  14.68
life                6.94   7.16   7.21   7.19   7.28   7.26   7.33   7.45   7.45   7.49
logic               8.37   8.93   9.08   9.21   9.38   9.55   9.68   9.64   9.68   9.87
mandelbrot          6.27   6.41   6.51   6.46   6.47   6.54   6.53   6.79   6.65   6.61
matrix-multiply     6.39   6.58   6.60   6.58   6.64   6.59   6.60   6.82   6.80   6.80
md5                 7.15   7.60   7.60   7.55   7.70   7.77   7.71   7.92   7.86   7.89
merge               6.33   6.46   6.46   6.44   6.50   6.53   6.51   6.69   6.65   6.71
mlyacc             27.03  31.69  32.33  32.77  34.92  35.74  36.21  35.03  35.71  36.26
model-elimination  29.45  35.32  36.12  36.90  40.26  41.08  41.59  40.47  41.21  41.89
mpuz                6.41   6.62   6.64   6.61   6.74   6.72   6.94   6.86   6.96   6.92
nucleic            13.84  14.11  14.14  14.33  14.25  14.18  14.32  14.49  14.62  14.54
output1             7.13   7.41   7.46   7.44   7.55   7.61   7.60   7.54   7.61   7.64
peek                6.99   7.37   7.48   7.42   9.20   7.50   7.53   7.64   7.63   7.63
psdes-random        6.29   6.48   6.48   6.50   6.54   6.58   6.55   6.73   6.73   6.73
ratio-regions       7.61   8.21   8.35   8.40   8.52   8.60   8.75   8.80   8.90   8.96
ray                10.38  11.85  11.95  12.01  12.64  12.74  12.81  12.88  12.97  12.98
raytrace           14.89  16.56  16.82  17.09  18.02  18.34  18.51  17.99  18.30  18.66
simple             12.26  13.56  13.84  14.05  14.53  14.81  14.96  14.61  14.88  15.22
smith-normal-form  10.56  11.19  11.38  11.43  11.54  11.68  11.71  11.75  11.73  11.79
tailfib             6.24   6.37   6.38   6.38   6.45   6.61   6.45   6.66   6.61   6.58
tak                 6.19   6.39   6.39   6.42   6.41   6.40   6.39   6.59   6.61   6.65
tensor              8.94   9.97  10.01  10.06  10.34  10.42  10.57  10.51  10.58  10.72
tsp                 7.43   7.96   8.03   8.01   8.11   8.16   8.17   8.29   8.34   8.37
tyan                9.75  10.94  11.16  11.27  11.72  11.84  12.01  11.87  11.97  12.12
vector-concat       6.34   6.51   6.49   6.48   6.51   6.61   6.57   6.68   6.70   6.76
vector-rev          9.72   9.96   8.36   6.44   6.49   6.47   7.93   8.12   6.70   6.64
vliw               20.52  27.12  27.67  28.16  30.44  30.98  31.46  30.50  31.12  31.47
wc-input1           7.92   8.55   8.57   8.53   8.84   8.89   8.86   8.86   8.87   8.93
wc-scanStream       8.25   8.87  10.73   8.87  11.11   9.25   9.24   9.25   9.31   9.36
zebra               9.34  10.30  10.55  10.70  11.27  11.46  11.68  11.50  11.67  11.85
zern                6.98   7.50   9.02   8.05   8.27   7.98  11.08  10.93   8.31   8.39
run time
benchmark         MLton0 MLton1 MLton2 MLton3 MLton4 MLton5 MLton6 MLton7 MLton8 MLton9
barnes-hut         57.24  60.61  62.89  62.36  61.42  62.92  60.94  59.01  58.45  59.28
boyer              60.94  66.68  68.31  66.93  66.57  70.50  69.49  60.80  67.46  68.27
checksum          106.87 181.15 196.39 186.62 180.07 192.27 186.60 179.62 196.48 187.28
count-graphs       48.46  50.00  57.75  55.52  52.84  55.99  57.42  52.16  56.59  57.20
DLXSimulator      102.96 103.28 105.92 107.19 102.11 106.13  93.10  99.22 100.93 108.54
fft                37.00  36.63  36.91  38.31  37.95  39.35  38.47  39.79  40.98  41.12
fib                70.90  96.63 100.12 101.97 100.29 105.91 106.08  98.24 112.69 107.40
flat-array         31.95  32.09  34.01  33.88  33.92  31.11  34.87  27.55  28.50  29.20
hamlet             53.17  56.22  58.79  59.48  59.18  60.61  61.39  59.74  61.93  62.06
imp-for            46.97  46.98  85.03 110.53  46.83  68.45 108.97  46.38  68.09 109.04
knuth-bendix       38.73  46.54  52.05  54.80  48.51  49.22  51.70  48.41  49.48  52.00
lexgen             44.58  46.75  49.08  49.35  47.95  48.83  51.13  47.11  48.08  50.28
life               14.49  16.59  20.44  24.95  19.01  23.15  25.91  19.51  23.73  26.55
logic              55.79  56.95  60.01  59.13  58.19  61.53  61.38  58.51  61.79  61.70
mandelbrot         82.74  86.12  90.43  92.88  86.12  94.78  91.62  56.24  58.49  63.65
matrix-multiply     7.68   7.58   8.54   8.64   7.52   8.46   8.60   6.84   9.08   9.57
md5                53.40  74.92  83.85  86.62  77.61  89.20  97.46  77.68  89.22  97.62
merge              85.59  85.54  87.99  87.91  85.50  88.50  88.30  86.03  88.79  88.94
mlyacc             41.05  42.13  43.13  43.40  42.88  44.16  44.38  43.09  44.42  44.59
model-elimination  80.34  84.06  88.83  87.13  86.02  90.37  89.34  86.76  90.58  90.36
mpuz               41.98  41.92  51.95  55.95  42.99  52.85  55.96  42.91  52.93  56.09
nucleic            45.62  45.22  46.11  46.27  45.03  46.25  46.45  49.45  46.40  46.61
output1            15.28  14.89  16.21  18.01  14.88  16.21  18.00  14.89  16.23  18.01
peek               35.59  44.45  53.36  71.85  44.88  53.34  71.16  44.46  53.37  71.16
psdes-random       39.00  43.00  54.00  62.33  43.01  53.02  62.05  43.04  55.07  63.41
ratio-regions      53.19  60.09  65.78  67.26  61.65  67.79  69.45  61.97  68.71  70.11
ray                30.81  32.93  35.10  33.53  32.89  33.88  34.41  32.33  33.36  34.38
raytrace           42.70  44.12  48.52  46.96  44.87  49.77  47.67  45.23  50.00  48.09
simple             60.58  60.83  67.15  68.24  61.94  68.14  69.51  57.09  63.34  64.72
smith-normal-form  37.07  37.20  37.17  37.16  37.13  37.17  37.16  37.37  37.41  37.33
tailfib            43.94  41.98  50.87  66.67  41.97  50.88  66.71  42.01  50.91  66.75
tak                27.71  37.69  36.51  38.05  39.88  38.44  40.60  39.57  38.49  40.59
tensor             59.18  49.48  73.70  88.37  49.51  73.75  88.41  49.53  73.75  88.39
tsp                64.71  65.23  66.66  67.07  66.74  66.73  67.19  66.75  66.89  67.27
tyan               57.99  62.09  64.14  64.23  62.60  64.83  64.93  63.20  65.14  65.23
vector-concat      91.58  92.20  92.65  92.22  92.64  92.65  92.42  93.48  94.30 101.66
vector-rev        192.95 126.75 125.78 123.88 117.43 129.67 162.24 133.67 123.70 125.50
vliw               53.31  57.85  60.18  57.35  66.30  63.35  60.80  58.73  64.36  61.87
wc-input1          38.84  48.06  49.89  53.64  46.67  46.92  55.30  45.84  46.05  54.65
wc-scanStream      34.20 289.13 285.63 288.64 279.21 248.43 249.22 244.16 247.24 249.64
zebra              40.66  42.88  60.91  66.80  52.33  61.03  64.26  51.81  61.64  65.76
zern               41.36  42.37  62.74  51.16  45.63  56.37  66.07  45.30  49.46  50.87
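
The run time ratio table appears to be each run time divided by the MLton0 
run time for the same benchmark, printed to two decimal places.  A small 
Python sketch of that normalization, using the imp-for row copied from the 
run time table (the helper name is hypothetical):

```python
def normalize(times):
    """Divide every entry by the first one (the MLton0 baseline) and
    format the result to two decimal places."""
    baseline = times[0]
    return [f"{t / baseline:.2f}" for t in times]


# imp-for run times for MLton0 .. MLton9, from the "run time" table.
imp_for = [46.97, 46.98, 85.03, 110.53, 46.83, 68.45, 108.97, 46.38, 68.09, 109.04]

print(" ".join(normalize(imp_for)))
```

For this row the output matches the imp-for line of the run time ratio 
table.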