fully polyvariant self-compile

Thu, 17 May 2001 16:13:06 -0700 (PDT)

> My  friend  Rico  has  4  machines,  each with 4 gig of RAM.  I wanted to try
> compiling MLton without the -no-polyvariance flag to see if it would fit, but
> sadly  I  can't  ask  for  a  heap  size  bigger  than  2 gig because the int
> overflows.  Also on these machines /proc/meminfo is  going  to  hold  numbers
> bigger  than  will  fit  in  an  int.   Either  this  has  to  be  done  with
> IntInf.int's, or Word32.word's, or else it is another use  for  Int64  (along
> with  file  positions).   Could you produce something I could try with any of
> these?  I'm really curious if the self-compile with polyvariance would  work,
> and if it would result in a faster compiler.

My feeling is that we need to add support for Int64.  Maybe Matthew and I can do 
that this summer.
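
To make the overflow concrete, here is a minimal Python sketch (not MLton
code). It assumes a 32-bit two's-complement int and reads "2000m" as 2000 * 2^20
bytes; the helper names are made up for illustration.

```python
# Sketch: which sizes fit in a signed 32-bit int, an unsigned 32-bit word,
# and a signed 64-bit int. All names here are hypothetical.

INT32_MIN, INT32_MAX = -2**31, 2**31 - 1
WORD32_MAX = 2**32 - 1
INT64_MIN, INT64_MAX = -2**63, 2**63 - 1


def fits_in_int32(n):
    return INT32_MIN <= n <= INT32_MAX


def fits_in_word32(n):
    return 0 <= n <= WORD32_MAX


def fits_in_int64(n):
    return INT64_MIN <= n <= INT64_MAX


def wrap_int32(n):
    """Two's-complement wraparound to 32 bits (one possible overflow behavior)."""
    n &= 0xFFFFFFFF
    return n - 2**32 if n >= 2**31 else n


# Assumption: "m" means 2^20 bytes, so 2000m is just under the int32 limit.
heap_2000m = 2000 * 1024 * 1024
# A full 4 gig, taken as exactly 4 * 2^30 bytes.
ram_4gig = 4 * 1024**3

for name, size in [("heap 2000m", heap_2000m), ("4 gig", ram_4gig)]:
    print(name, size,
          "int32:", fits_in_int32(size),
          "word32:", fits_in_word32(size),
          "int64:", fits_in_int64(size))
```

Under these assumptions a 2000m heap still fits in a 32-bit int, while a full
4 gig does not, and exactly 2^32 bytes is also one past the largest Word32
value. Int64 (or IntInf) has room to spare.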

> I  went  to my friend Rico's place and used his machines with 4 gig of Ram to
> compile a version of MLton without  using  the  -no-polyvariance  flag.   The
> result  was  that  it compiled fine using a fixed heap of 2000m.  Then I used
> the result to do a second self-compile, again with a  fixed  heap  of  2000m.
> The  ratio of old time / new time was 1.4.  The ratio of new text size (i.e.,
> the first number output by the size program) / old text size was also 1.4.

This jibes with what I have seen before in terms of both running time and
speedup.

> All of this is with the 2001-03-21 version of MLton.  I'm quite certain  that
> at  some  point in the past I tried to do a self compile on Siskind's machine
> with 2 gig of RAM and it didn't fit,  so  some  change  in  the  compiler  is
> clearly  at  least  partly  responsible.   (Note,  that  was a long time ago.
> Certainly way before the native back end.)

There have been lots of improvements over the last two years -- one of the most
relevant is probably the inliner, which has better size metrics to prevent
blowup.

> Encouraged by  my  test on Rico's 4-gig machines, I just tried to do the same
> thing (self-compile MLton without the -no-polyvariance  flag)  on  a  machine
> with 1 gig of RAM.  It worked fine.
> 
> Here  is  the output of using stock 2001-03-21 MLton to compile mlton with no
> -no-polyvariance flag:
...
> 	GC time(ms): 380,060 (42.3%)
> 	maxPause(ms): 6,640
> 	number of GCs: 341
> 	bytes allocated: 41,940,453,700
> 	bytes copied: 11,292,451,360
> 	max bytes live: 188,775,248
> 
> 	real	15:24.38
> 	user	913.46
> 	sys	8.35
> 
> and  here  is  the result of using what that produced to compile mlton, again
> with no -no-polyvariance flag:
...
> 	GC time(ms): 380,060 (42.3%)
> 	maxPause(ms): 6,640
> 	number of GCs: 341
> 	bytes allocated: 41,940,453,688
> 	bytes copied: 11,292,451,376
> 	max bytes live: 188,775,248
> 
> 	real	15:23.38
> 	user	915.76
> 	sys	7.60
> 
> So it didn't speed things up any on this machine.  I would guess that this is
> due to GC being pretty dominant.

This makes no sense to me.  I claim there is some problem in your experiment and
that you actually did the same thing in both cases.  Note that the GC time and
its percentage of the total are exactly the same in both cases (and hence so is
the non-GC time).  There should be some difference more like what you saw in
your first experiment.
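
The comparison can be checked mechanically. The Python sketch below uses only
the numbers from the two quoted outputs; the dictionary keys and helper name
are made up for illustration, and the total is only as accurate as the rounded
42.3% figure.

```python
# Sketch: compare the two quoted GC summaries and timings.

def parse_count(s):
    """Turn a comma-grouped count such as '380,060' into an int."""
    return int(s.replace(",", ""))


run1 = {
    "gc_ms": "380,060",
    "max_pause_ms": "6,640",
    "num_gcs": "341",
    "allocated": "41,940,453,700",
    "copied": "11,292,451,360",
    "max_live": "188,775,248",
}
run2 = {
    "gc_ms": "380,060",
    "max_pause_ms": "6,640",
    "num_gcs": "341",
    "allocated": "41,940,453,688",
    "copied": "11,292,451,376",
    "max_live": "188,775,248",
}

# Fields that match exactly, and the signed difference for those that do not.
identical = [k for k in run1 if parse_count(run1[k]) == parse_count(run2[k])]
diffs = {k: parse_count(run2[k]) - parse_count(run1[k])
         for k in run1 if k not in identical}

# Both runs report GC as 42.3% of the total, so the implied total and
# non-GC time are the same for both (up to rounding of the percentage).
gc_ms = parse_count(run1["gc_ms"])
total_ms = gc_ms / 0.423
non_gc_ms = total_ms - gc_ms

# Ratio of old user time to new user time, from the quoted timings.
user_ratio = 913.46 / 915.76

print("identical fields:", identical)
print("differences (run2 - run1):", diffs)
print("implied total ms:", round(total_ms), "non-GC ms:", round(non_gc_ms))
print("old/new user time:", round(user_ratio, 4))
```

The byte counts differ by only a handful of bytes out of tens of billions, and
the user-time ratio is essentially 1, nowhere near the 1.4 from the first
experiment.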

Anyway, my current feeling on the self compile is that one should compare
-no-polyvariance compiling -no-polyvariance to the version with polyvariance
compiling with polyvariance.  I believe the former is still significantly
faster, and so I'll stick with -no-polyvariance for now.