[MLton-devel] size of closures

Matthew Fluet fluet@CS.Cornell.EDU
Sat, 18 Jan 2003 15:53:52 -0500 (EST)


I reworked the StreamIO functor to cut down on allocation.  I finally got
it on par with the existing IO (yeah!), but then I went back to clean up
some code, and I ended up blowing up the allocation by quite a bit.  Here's
the relevant code:

      datatype state = Link of {inp: V.vector, next: state ref}
	             | Eos of {next: state ref}
	             | End
	             | Truncated
	             | Closed
      datatype instream = In of {common: {reader: reader,
					  augmented_reader: reader,
					  tail: state ref ref},
				 pos: int (* !state <> Link _ ==> pos = 0 *),
				 state: state ref}

      fun update (In {common, ...}, pos, state) =
	In {common = common,
	    pos = pos,
	    state = state}
      fun updatePos (is as In {state, ...}, pos) = update (is, pos, state)
      fun updateState (is, state) = update (is, 0, state)

Now, at first I had this version of input1:

      fun input1 (is as In {pos, state, ...}) =
	case !state of
	  Link {inp, next} =>
	    let
	      val (k, next) = if pos + 1 < V.length inp
				then (pos + 1, state)
				else (0, next)
	    in
	      SOME (V.sub (inp, pos), update (is, k, next))
	    end
	| End =>
	    let val _ = extendB "input1" is
	    in input1 is
	    end
	| _ => NONE


And I get:

[fluet@localhost cvs.HEAD.alloc]$ mlton.cvs.HEAD -profile alloc
-profile-basis true -profile-il source -profile-stack true -keep ssa
../wc-scanStream.sml ; ./wc-scanStream 30 ; mlprof -raw true -thresh 5
wc-scanStream mlmon.out

55,632,880 bytes allocated (2,336 bytes by GC)
     function       cur      raw      stack      raw       GC    raw
------------------ ----- ------------ ------ ------------ ---- -------
<main>              0.0%        (240) 100.0% (55,632,856) 0.0% (1,228)
<GC_arrayAllocate> 56.6% (31,468,192) 100.0% (55,632,700) 0.0% (1,228)
doit                0.0%          (0) 100.0% (55,618,740) 0.0%   (824)
array               0.0%          (0)  56.6% (31,468,176) 0.0%     (0)
array               0.0%          (0)  56.6% (31,468,176) 0.0%     (0)
wc                  0.0%          (0)  55.0% (30,614,280) 0.0%     (0)
loop                0.0%          (0)  55.0% (30,614,280) 0.0%     (0)
scanStream          0.0%          (0)  54.8% (30,485,760) 0.0%     (0)
doit                0.0%          (0)  54.8% (30,481,920) 0.0%     (0)
extend              0.0%          (0)  54.8% (30,481,920) 0.0%     (0)
extendB             0.0%          (0)  54.8% (30,481,920) 0.0%     (0)
input1              0.0%          (0)  54.8% (30,481,920) 0.0%     (0)
loop                0.0%          (0)  54.8% (30,481,920) 0.0%     (0)
o                   0.0%          (0)  54.5% (30,343,320) 0.0%     (0)
readVec             0.0%          (0)  54.5% (30,334,680) 0.0%     (0)
readVec             0.0%          (0)  54.5% (30,317,040) 0.0%     (0)
tabulate            0.0%          (0)  43.1% (24,000,000) 0.0%     (0)
loop               21.6% (12,000,720)  21.6% (12,001,104) 0.0%     (0)
loop               21.6% (12,000,000)  21.6% (12,000,000) 0.0%     (0)


Which is reasonable (or, at least on par with the old IO).
Now, I change input1 to the following, which I thought was a little
cleaner:

      fun input1 (is as In {pos, state, ...}) =
	case !state of
	  Link {inp, next} =>
	    let
	      val update = if pos + 1 < V.length inp
			     then fn () => updatePos (is, pos + 1)
			     else fn () => updateState (is, next)
	    in
	      SOME (V.sub (inp, pos), update ())
	    end
	| End =>
	    let val _ = extendB "input1" is
	    in input1 is
	    end
	| _ => NONE

And now I get:

[fluet@localhost cvs.HEAD.alloc]$ mlton.cvs.HEAD -profile alloc
-profile-basis true -profile-il source -profile-stack true -keep ssa
../wc-scanStream.sml ; ./wc-scanStream 30 ; mlprof -raw true -thresh 5
wc-scanStream mlmon.out

1,015,633,360 bytes allocated (13,872 bytes by GC)
     function       cur       raw      stack        raw        GC    raw
------------------ ----- ------------- ------ --------------- ---- --------
<main>              0.0%         (240) 100.0% (1,015,633,336) 0.0% (12,764)
<GC_arrayAllocate>  3.1%  (31,468,192) 100.0% (1,015,633,180) 0.0% (12,764)
doit                0.0%           (0) 100.0% (1,015,619,220) 0.0% (12,360)
wc                  0.0%           (0)  97.5%   (990,614,760) 0.0% (11,124)
loop                0.0%           (0)  97.5%   (990,614,760) 0.0% (11,124)
scanStream         47.3% (480,000,480)  97.5%   (990,486,240) 0.0%      (0)
input1             47.3% (480,000,000)  50.3%   (510,481,920) 0.0%      (0)
loop                0.0%           (0)  50.3%   (510,481,920) 0.0%      (0)

Since updatePos and updateState should be inlined, I was very surprised by
this increase in allocation.  I'm also a bit concerned about the fact that
scanStream's cur allocation increased, since I only modified input1.

It seems obvious to me that the two versions are semantically
equivalent.  I suppose that MLton is unable to compare the two anonymous
functions and determine that they could share a common closure of just an
instream * int * state ref that dispatches to update.
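
To make the comparison concrete, here is a cut-down sketch of the three
shapes involved.  Everything in it is a stand-in: the stream type, chunkLen,
and the stepDirect/stepThunk/stepShared names are hypothetical and much
simpler than the real instream, and it says nothing about what MLton
actually generates.  It only shows that the thunk version and a hand-written
"common closure" version compute the same thing as the direct one.

```sml
(* Hypothetical, simplified stream: a position within a chunk and a chunk
   index.  The real instream carries a reader and state refs instead. *)
datatype stream = S of {pos: int, chunk: int}

fun update (S _, pos, chunk) = S {pos = pos, chunk = chunk}
fun updatePos (s as S {chunk, ...}, pos) = update (s, pos, chunk)
fun updateState (s, chunk) = update (s, 0, chunk)

val chunkLen = 4  (* assumed fixed chunk length, for illustration only *)

(* First style: pick the arguments, then call update directly. *)
fun stepDirect (s as S {pos, chunk}) =
  let
    val (k, next) = if pos + 1 < chunkLen
                      then (pos + 1, chunk)
                      else (0, chunk + 1)
  in
    update (s, k, next)
  end

(* Second style: pick one of two anonymous functions, then apply it.
   Each fn captures different free variables. *)
fun stepThunk (s as S {pos, chunk}) =
  let
    val upd = if pos + 1 < chunkLen
                then fn () => updatePos (s, pos + 1)
                else fn () => updateState (s, chunk + 1)
  in
    upd ()
  end

(* Hand-written "common closure": both branches build the same record of
   stream * int * int, and a single apply dispatches to update. *)
datatype thunk = Upd of stream * int * int

fun apply (Upd (s, pos, chunk)) = update (s, pos, chunk)

fun stepShared (s as S {pos, chunk}) =
  let
    val t = if pos + 1 < chunkLen
              then Upd (s, pos + 1, chunk)
              else Upd (s, 0, chunk + 1)
  in
    apply t
  end
```

All three functions return equal streams for any input; the open question
is only whether the compiler can see through the second form as easily as
through the first.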


