[MLton] max-heap setting for 64-bit applications
Matthew Fluet
matthew.fluet at gmail.com
Mon Dec 14 08:50:48 PST 2009
On Sat, Dec 12, 2009 at 8:24 AM, Wesley W. Terpstra <wesley at terpstra.ca> wrote:
> On Sat, Dec 12, 2009 at 8:48 AM, Matthew Fluet <matthew.fluet at gmail.com> wrote:
>>
>> It isn't quite sequential access.
>
> Sorry. You're both wrong. ;) The mremap code does a memcpy, which is 100%
> sequential. That's the only transition between the two heaps on Windows:
> inside mremap. It allocates the new heap, memcpys the old contents into it,
> then frees the old heap. The only time the GC touches it is when it fixes up
> the addresses.

But the Windows-specific mremap could still fail: note that the
growHeap function demands significant growth from mremap. If that
fails, growHeap then attempts the alloc/copy, but allows the new heap
to be allocated at any size down to the minimum. (Again, perhaps not
extremely likely, but in a program that uses the FFI and does C-side
allocations that somewhat fragment the virtual address space, I could
imagine a 1.25G heap that cannot be (significantly) grown to 2.75G via
mremap, while the alloc/copy succeeds with a heap of 1.3G.)
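
To make the two-step fallback concrete, here is a rough sketch in C. It
is not the MLton runtime code: grow_heap_sketch, copy_to_new_heap,
sim_alloc, and sim_largest_free are hypothetical names, the halving
back-off policy is an assumption, and fragmentation is only simulated
by capping the largest request that can succeed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical: largest single request the simulated (fragmented)
   address space can satisfy. Defaults to "no limit". */
static size_t sim_largest_free = (size_t)-1;

/* Hypothetical allocator: fails for requests larger than the cap. */
static void *sim_alloc(size_t size) {
    if (size > sim_largest_free) return NULL;
    return malloc(size);
}

/* Emulated remap: allocate a new heap, memcpy the live data, free the
   old heap. On failure returns NULL and leaves the old heap untouched. */
static void *copy_to_new_heap(void *old, size_t live, size_t new_size) {
    assert(live <= new_size);
    void *fresh = sim_alloc(new_size);
    if (fresh == NULL) return NULL;
    memcpy(fresh, old, live);
    free(old);
    return fresh;
}

/* Step 1: demand the full desired size.
   Step 2: if that fails, retry with smaller sizes, down to minimum
   (assumed policy: halve the distance to minimum on each retry).
   Returns the new heap and stores its size in *got, or returns NULL
   with the old heap still valid. */
static void *grow_heap_sketch(void *old, size_t live, size_t desired,
                              size_t minimum, size_t *got) {
    assert(live <= minimum && minimum <= desired);
    size_t size = desired;
    void *heap = copy_to_new_heap(old, live, size);
    while (heap == NULL && size > minimum) {
        size = minimum + (size - minimum) / 2;
        heap = copy_to_new_heap(old, live, size);
    }
    if (heap != NULL) *got = size;
    return heap;
}
```

Scaling the numbers above down to bytes: with sim_largest_free set to
1300, growing a 1250-byte heap with desired = 2750 and minimum = 1300
fails at every size above 1300 and finally succeeds at exactly 1300.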

>> >> the mremap function is described as using the Linux page table scheme to
>> >> efficiently change the mapping between virtual addresses and (physical)
>> >> memory pages. Its purpose is to be more efficient than allocating a new
>> >> map and copying.
>> >
>> > If I could ...
>>
>> I guess my point is that the way that you indicate that you aren't
>> more efficient than alloc/copy is by not providing mremap. Everything
>> else in the generic implementation with attempting the in place expand
>> is more efficient; it is just the starting off with the alloc/copy
>> that doesn't seem to make sense.
>
> As already mentioned, the design goal for Windows mremap was not efficiency,
> but to use more virtual address space. I think both agree that alloc/copy as
> the last step makes more sense for 64-bit systems to reduce the likelihood
> of thrashing.

I guess the question is whether the large alloc/copy is really more
efficient even on a 32-bit system.