Code expansion due to monomorphization in MLton?

Stephen Weeks MLton@sourcelight.com
Wed, 10 Jan 2001 17:46:05 -0800 (PST)


> I recently had an email conversation with Suresh about
> code sizes in MLton.  I am very impressed by the sizes 
> that are reported at: 
> 
> http://www.sourcelight.com/MLton/HTML/node12.html

Thanks.  The code size is so small is due to the dead code elimination used at
several passes of the compiler.

> I am curious about how code size changes between the various
> stages of the MLton compiler.  For instance, do you have any 
> numbers on worst-case or average-case code expansion factors 
> due to (1) the monomorphization stage and (2) the
>  defunctionalization stage of MLton? 
> 
> Thank you for any information you can provide!

There are a a few old measurements of syntax tree size increase due to
monomorphisation on page 11 of 
	http://www.star-lab.com/sweeks/papers/99-intertrust.ps
These are from a talk I gave a couple of years ago, but I don't think things
have changed too much since then.  Basically, on most programs, monomorphisation 
doesn't increase code size at all, or even decreases it a bit due to subsequent
simplification.  On some heavily polymorphic programs (MLton being one such),
you can see up to 60% blowup.  I don't remember ever seeing more than a factor
of two on a real program.

I should also mention that its easy for you to get new measurements on any
program you want.  If you compile a file with "mlton -v", then various
interesting info about intermediate languages will be printed out.  In
particular, you can see the number of syntax tree nodes as well as the size in
bytes of the syntax trees.  At the end of this message is a log of a self
compile that I just did today.  It will look slightly different than a log you
may see since I am using a newer version of mlton, but it should be pretty
similar.  Anyways, here's what the log shows about the blowup due to
monomorphisation (called "mono" in the log).

			    before	     after
syntax tree size bytes	21,076,804	38,878,996
syntax tree # of nodes	   123,965	   210,874

As to defunctionalization, I don't have any old numbers.  It's also a bit dicier 
to compare, since the ILs before and after are more different than with
monomorphisation.  But, you can get something from "mlton -v".  Here are the
sizes before and after defunctionalization (called "closure convert" in the
log).

			    before	     after
syntax tree size bytes	37,070,988	55,193,708	
syntax tree # of nodes	   197,544	   154,193 (= 1003 + 63424 + 89766)

The syntax tree # of nodes comparison is probably bogus, but the size in bytes
comparison probably isn't too unreasonable.

> I hope all is well with you. Are you still at InterTrust? 

Thanks.  Yep, I'm still at InterTrust, although I'm sure I'm spending more time
on MLton than my employer would like :-)

We'd (Henry, Suresh, Matthew, and me) be interested to hear sometime about
what's going on with CIL and your group's SML compiler.  I check the Church
group mailing list archives from time to time, but haven't seen too much about a
working compiler or benchmarks.

Here's the log.  Have fun.

--------------------------------------------------------------------------------

MLton internal (built Tue Jan  9 18:53:23 2001 on starlinux.epr.com)
  created this file on Wed Jan 10 17:13:54 2001.
Do not edit this file.
Flag settings: 
   aux: false
   chunk: chunk per function
   contify strategy: Both
   debug: false
   defines: [NODEBUG,MLton_safe=TRUE,MLton_detectOverflow=TRUE]
   detect overflow: true
   fixed heap: None
   indentation: 3
   includes: [mlton.h]
   inline: NonRecursive {product = 320,small = 60}
   input file: mlton.cm
   instrument: false
   instrument Sxml: false
   keep Cps: false
   match: left to right
   native: true
   native commented: 0
   native copy prop: true
   future: 64
   native ieee fp: false
   native live transfer: false
   native move hoist: true
   native optimize: 1
   native split: Some(100000)
   polyvariance: None
   print at fun entry: false
   profile: false
   safe: true
   show types: false
   static: false
   use basis library: true
   verbose: true
   very verbose: true
Compile SML starting
   parse and elaborate starting
   parse and elaborate finished in 12.110
   core-ml size is 11,276,392 bytes
   numPeeks = 14
   average position in property list = 0.0
   numPeeks = 297597
   average position in bucket = 0.732
   lex and parse totals 10.550
   elaborate totals 1.440
   dead starting
   dead finished in 0.090
   basis size is 803,616 bytes
   numPeeks = 74673
   average position in property list = 0.0
   numPeeks = 297597
   average position in bucket = 0.732
   size = 191557
   gcc -DNODEBUG -DMLton_safe=TRUE -DMLton_detectOverflow=TRUE \
       -I/home/sweeks/mlton/include -o /tmp/filegjrFeE /tmp/fileoj1Map.c \
       -L/home/sweeks/mlton/lib -lmlton -lm -lgmp
   /tmp/filegjrFeE /tmp/file8SdK15
   infer starting
      unification starting
      unification finished in 1.780
      finish infer starting
      finish infer finished in 22.060
   infer finished in 24.710
   xml.unsimplified size is 37,704,428 bytes
   numPeeks = 1110205
   average position in property list = 0.000
   numPeeks = 439645
   average position in bucket = 0.901
   infer simplify starting
   infer simplify finished in 3.650
   xml size is 21,076,804 bytes
   numPeeks = 3358678
   average position in property list = 0.100
   numPeeks = 439645
   average position in bucket = 0.901
   size = 123965
   num types in program = 21951
   num distinct types = 37138
   hash table size is 0 bytes
   mono starting
   mono finished in 6.970
   mono.unsimplified size is 49,208,176 bytes
   numPeeks = 8353138
   average position in property list = 0.040
   numPeeks = 1248567
   average position in bucket = 1.709
   mono simplify starting
   mono simplify finished in 6.260
   mono size is 38,878,996 bytes
   numPeeks = 11783665
   average position in property list = 0.079
   numPeeks = 1248567
   average position in bucket = 1.709
   size = 210874
   num types in program = 14682
   num distinct types = 70569
   hash table size is 0 bytes
   implement exceptions starting
   implement exceptions finished in 1.590
   sxml.unsimplified size is 39,352,420 bytes
   numPeeks = 12110328
   average position in property list = 0.077
   numPeeks = 1250299
   average position in bucket = 1.709
   implement exceptions simplify starting
   implement exceptions simplify finished in 4.140
   sxml size is 37,070,988 bytes
   numPeeks = 14744857
   average position in property list = 0.088
   numPeeks = 1250299
   average position in bucket = 1.709
   polyvariance starting
   polyvariance finished in 0.0
   sxml.poly size is 37,070,988 bytes
   numPeeks = 14744857
   average position in property list = 0.088
   numPeeks = 1250299
   average position in bucket = 1.709
   size = 197544
   num types in program = 14242
   num distinct types = 70868
   hash table size is 0 bytes
   closure convert starting
      original size is 37,070,988 bytes
      after flow analysis size is 67,772,664 bytes
      free variables starting
      free variables finished in 3.280
      after lambda free size is 68,526,924 bytes
      globalize starting
      globalize finished in 0.380
      after globalize size is 68,190,932 bytes
      convert starting
      convert finished in 32.350
   closure convert finished in 45.680
   cps.unsimplified size is 73,751,844 bytes
   numPeeks = 20900956
   average position in property list = 0.948
   numPeeks = 1319961
   average position in bucket = 1.658
   closure convert simplify starting
      num functions 13742
      num local functions 154411
      num primExps 172668
      removeUnused starting
      removeUnused finished in 4.840
      num functions 11896
      num local functions 90414
      num primExps 154232
      leaf-inline starting
	 inline starting
	 inline finished in 7.420
      leaf-inline finished in 7.420
      num functions 8989
      num local functions 63124
      num primExps 152320
      raise-to-jump starting
	 inferHandlers starting
	 inferHandlers finished in 0.170
      raise-to-jump finished in 4.980
      num functions 8989
      num local functions 62688
      num primExps 152297
      contify starting
      contify finished in 3.440
      num functions 3929
      num local functions 58319
      num primExps 142468
      constantPropagation starting
	 inferHandlers starting
	 inferHandlers finished in 0.150
	 fixed point starting
	 fixed point finished in 4.210
      constantPropagation finished in 9.010
      num functions 3929
      num local functions 57665
      num primExps 104326
      useless starting
	 analyze starting
	 analyze finished in 8.350
      useless finished in 12.980
      num functions 3929
      num local functions 54322
      num primExps 94248
      removeUnused starting
      removeUnused finished in 0.650
      num functions 3861
      num local functions 52900
      num primExps 91523
      simplifyTypes starting
	 fixed point starting
	 fixed point finished in 0.080
      simplifyTypes finished in 3.250
      num functions 3861
      num local functions 43988
      num primExps 88116
      poly-equal starting
      poly-equal finished in 0.170
      num functions 3873
      num local functions 44640
      num primExps 88650
      contify starting
      contify finished in 0.840
      num functions 3712
      num local functions 44591
      num primExps 88458
      inline starting
      inline finished in 6.140
      num functions 1004
      num local functions 68568
      num primExps 139890
      removeUnused starting
      removeUnused finished in 0.860
      num functions 1004
      num local functions 66294
      num primExps 138773
      raise-to-jump starting
	 inferHandlers starting
	 inferHandlers finished in 0.170
      raise-to-jump finished in 3.600
      num functions 1004
      num local functions 66247
      num primExps 138756
      contify starting
      contify finished in 3.240
      num functions 1003
      num local functions 66245
      num primExps 138754
      introduce-loops starting
      introduce-loops finished in 0.040
      num functions 1003
      num local functions 66270
      num primExps 138754
      loop-invariant starting
      loop-invariant finished in 3.190
      num functions 1003
      num local functions 63649
      num primExps 131256
      flatten starting
	 analyze starting
	 analyze finished in 0.160
      flatten finished in 3.850
      num functions 1003
      num local functions 63737
      num primExps 91451
      redundant starting
      redundant finished in 3.080
      num functions 1003
      num local functions 63737
      num primExps 91451
      removeUnused starting
      removeUnused finished in 0.740
      num functions 1003
      num local functions 63424
      num primExps 89766
   closure convert simplify finished in 86.180
   cps size is 55,193,708 bytes
   numPeeks = 54324230
   average position in property list = 0.453
   numPeeks = 1605207
   average position in bucket = 1.844