[MLton-devel] size of closures
Matthew Fluet
fluet@CS.Cornell.EDU
Sat, 18 Jan 2003 15:53:52 -0500 (EST)
I reworked the StreamIO functor to cut down on allocation. I finally got
it on par with the existing IO (yeah!), but then I went back to clean up
some code, and I ended up blowing up the size by quite a bit. Here's the
relevant code:
datatype state = Link of {inp: V.vector, next: state ref}
               | Eos of {next: state ref}
               | End
               | Truncated
               | Closed
datatype instream = In of {common: {reader: reader,
                                    augmented_reader: reader,
                                    tail: state ref ref},
                           pos: int (* !state <> Link _ ==> pos = 0 *),
                           state: state ref}
fun update (In {common, ...}, pos, state) =
   In {common = common,
       pos = pos,
       state = state}
fun updatePos (is as In {state, ...}, pos) = update (is, pos, state)
fun updateState (is, state) = update (is, 0, state)
Now, at first I had this version of input1:
fun input1 (is as In {pos, state, ...}) =
   case !state of
      Link {inp, next} =>
         let
            val (k, next) = if pos + 1 < V.length inp
                               then (pos + 1, state)
                               else (0, next)
         in
            SOME (V.sub (inp, pos), update (is, k, next))
         end
    | End =>
         let val _ = extendB "input1" is
         in input1 is
         end
    | _ => NONE
And I get:
[fluet@localhost cvs.HEAD.alloc]$ mlton.cvs.HEAD -profile alloc
-profile-basis true -profile-il source -profile-stack true -keep ssa
../wc-scanStream.sml ; ./wc-scanStream 30 ; mlprof -raw true -thresh 5
wc-scanStream mlmon.out
55,632,880 bytes allocated (2,336 bytes by GC)
function             cur          raw  stack          raw   GC     raw
------------------ ----- ------------ ------ ------------ ---- -------
<main>              0.0%        (240) 100.0% (55,632,856) 0.0% (1,228)
<GC_arrayAllocate> 56.6% (31,468,192) 100.0% (55,632,700) 0.0% (1,228)
doit                0.0%          (0) 100.0% (55,618,740) 0.0%   (824)
array               0.0%          (0)  56.6% (31,468,176) 0.0%     (0)
array               0.0%          (0)  56.6% (31,468,176) 0.0%     (0)
wc                  0.0%          (0)  55.0% (30,614,280) 0.0%     (0)
loop                0.0%          (0)  55.0% (30,614,280) 0.0%     (0)
scanStream          0.0%          (0)  54.8% (30,485,760) 0.0%     (0)
doit                0.0%          (0)  54.8% (30,481,920) 0.0%     (0)
extend              0.0%          (0)  54.8% (30,481,920) 0.0%     (0)
extendB             0.0%          (0)  54.8% (30,481,920) 0.0%     (0)
input1              0.0%          (0)  54.8% (30,481,920) 0.0%     (0)
loop                0.0%          (0)  54.8% (30,481,920) 0.0%     (0)
o                   0.0%          (0)  54.5% (30,343,320) 0.0%     (0)
readVec             0.0%          (0)  54.5% (30,334,680) 0.0%     (0)
readVec             0.0%          (0)  54.5% (30,317,040) 0.0%     (0)
tabulate            0.0%          (0)  43.1% (24,000,000) 0.0%     (0)
loop               21.6% (12,000,720)  21.6% (12,001,104) 0.0%     (0)
loop               21.6% (12,000,000)  21.6% (12,000,000) 0.0%     (0)
Which is reasonable (or, at least on par with the old IO).
Now, I change input1 to the following, which I thought was a little
cleaner:
fun input1 (is as In {pos, state, ...}) =
   case !state of
      Link {inp, next} =>
         let
            val update = if pos + 1 < V.length inp
                            then fn () => updatePos (is, pos + 1)
                            else fn () => updateState (is, next)
         in
            SOME (V.sub (inp, pos), update ())
         end
    | End =>
         let val _ = extendB "input1" is
         in input1 is
         end
    | _ => NONE
And now I get:
[fluet@localhost cvs.HEAD.alloc]$ mlton.cvs.HEAD -profile alloc
-profile-basis true -profile-il source -profile-stack true -keep ssa
../wc-scanStream.sml ; ./wc-scanStream 30 ; mlprof -raw true -thresh 5
wc-scanStream mlmon.out
1,015,633,360 bytes allocated (13,872 bytes by GC)
function             cur           raw  stack             raw   GC      raw
------------------ ----- ------------- ------ --------------- ---- --------
<main>              0.0%         (240) 100.0% (1,015,633,336) 0.0% (12,764)
<GC_arrayAllocate>  3.1%  (31,468,192) 100.0% (1,015,633,180) 0.0% (12,764)
doit                0.0%           (0) 100.0% (1,015,619,220) 0.0% (12,360)
wc                  0.0%           (0)  97.5%   (990,614,760) 0.0% (11,124)
loop                0.0%           (0)  97.5%   (990,614,760) 0.0% (11,124)
scanStream         47.3% (480,000,480)  97.5%   (990,486,240) 0.0%      (0)
input1             47.3% (480,000,000)  50.3%   (510,481,920) 0.0%      (0)
loop                0.0%           (0)  50.3%   (510,481,920) 0.0%      (0)
Since updatePos and updateState should be inlined, I was very surprised by
this increase in allocation (from 55,632,880 to 1,015,633,360 bytes). I'm
also a bit concerned that scanStream's cur allocation went from 0 to
480,000,480 bytes, since I only modified input1.
It seems obvious to me that the two versions are semantically equivalent.
I suppose that MLton is unable to compare the two anonymous functions and
determine that a single common closure, holding just an
instream * int * state ref and dispatching to update, could be used.
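For reference, here is a minimal, self-contained sketch of the closure-building
Link case next to a variant that keeps the two helper calls but invokes them
directly in each arm of the if, so no function value is constructed in the
source. The types are simplified stand-ins and are assumptions, not the real
StreamIO code: V is taken to be CharVector, the common/reader fields and the
extendB path are dropped, and the names input1Closure, input1Direct, and drain
are hypothetical. It only checks that the two forms return the same elements;
it has not been profiled, so it says nothing about what MLton allocates for
either form.

```sml
(* Simplified model: V stands in for the functor's vector structure. *)
structure V = CharVector

datatype state = Link of {inp: V.vector, next: state ref}
               | End

datatype instream = In of {pos: int, state: state ref}

fun update (In _, pos, state) = In {pos = pos, state = state}
fun updatePos (is as In {state, ...}, pos) = update (is, pos, state)
fun updateState (is, state) = update (is, 0, state)

(* Closure form: pick a thunk in the if, then call it. *)
fun input1Closure (is as In {pos, state}) =
   case !state of
      Link {inp, next} =>
         let
            val update = if pos + 1 < V.length inp
                            then fn () => updatePos (is, pos + 1)
                            else fn () => updateState (is, next)
         in
            SOME (V.sub (inp, pos), update ())
         end
    | End => NONE

(* Direct form: same branches, each arm calls its helper itself. *)
fun input1Direct (is as In {pos, state}) =
   case !state of
      Link {inp, next} =>
         SOME (V.sub (inp, pos),
               if pos + 1 < V.length inp
                  then updatePos (is, pos + 1)
                  else updateState (is, next))
    | End => NONE

(* Read elements until the given input1 function returns NONE. *)
fun drain input1 is =
   case input1 is of
      NONE => []
    | SOME (c, is') => c :: drain input1 is'

(* Two chunks, "ab" then "c", followed by End. *)
val tail = ref End
val head = ref (Link {inp = "ab", next = ref (Link {inp = "c", next = tail})})
val is0 = In {pos = 0, state = head}

val () =
   if drain input1Closure is0 = drain input1Direct is0
      then print (implode (drain input1Direct is0) ^ "\n")
      else raise Fail "closure and direct forms disagree"
```

On the two-chunk stream above, both forms yield the characters a, b, c in
order, so the program prints "abc".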