From matthew.fluet at gmail.com Wed Jan 6 06:50:40 2010
From: matthew.fluet at gmail.com (Matthew Fluet)
Date: Wed Jan 6 06:51:16 2010
Subject: [MLton] target-map considered harmful
In-Reply-To: <162de7480912150421m38f30c2cx75d1bab647307470@mail.gmail.com>
References: <162de7480912150421m38f30c2cx75d1bab647307470@mail.gmail.com>
Message-ID:

On Tue, Dec 15, 2009 at 7:21 AM, Wesley W. Terpstra wrote:
> So, I've used a lot of MLton cross compilers and have been looking at
> how to get MLton packages that achieve this. I was hoping in the
> future to be able to type:
>   apt-get install mlton mlton-mingw32 mlton-mipsel
> ... and then get a mlton capable of compiling to my native system,
> mingw32, or mipsel targets.
>
> This seems quite easy to achieve, except the annoying target-map file
> is one file. Adding the needed sml/basis/config/c/xxx/c-types.sml and
> xxx/*.a files in the mlton-mingw32 package is no problem. However, I
> can't just overwrite target-map.
>
> Couldn't we instead have a personality file inside the target folders? eg:
>
> cat self/personality
> i386 linux
> cat i586-mingw32msvc/personality
> i386 mingw

That seems like a fine solution.  I would suggest that we put all of
the targets within a single "target" directory of the lib directory.
That would make it easy to find all the targets.

It might also be good to move the c-types.sml file into the
target-specific directory.  Then a target would be entirely self
contained.

From matthew.fluet at gmail.com Wed Jan 6 07:10:57 2010
From: matthew.fluet at gmail.com (Matthew Fluet)
Date: Wed Jan 6 07:11:31 2010
Subject: [MLton] MLton.hash deeply flawed
In-Reply-To: <162de7480911011606v6bf45c29v518f45b8620dda7c@mail.gmail.com>
References: <162de7480911011606v6bf45c29v518f45b8620dda7c@mail.gmail.com>
Message-ID:

On Sun, Nov 1, 2009 at 7:06 PM, Wesley W. Terpstra wrote:
> Strings that differ in only a few places don't get unique hash values
> from MLton.hash.
> In a program where I tried to use MLton.hash I had 38
> collisions out of 8325 distinct input strings. Not good.
>
> val x = "klahjflaskjflaksjfgklajsglkasjglaksjglaksjglaksgjaklsgaslkgjaslgkjas"
> val y = "klahjflbskjflaksjfgklajsglkasjglaksjglaksjglaksgjaklsgaslkgjaslgkjaS"
>
> val () = print (Word32.toString (MLton.hash x) ^ "\n")
> val () = print (Word32.toString (MLton.hash y) ^ "\n")

A late reply, but the rationale was that the application that prompted
the introduction of MLton.hash required a constant time hash function.
So, MLton.hash only looks at structures to a fixed depth (default 16)
and samples vectors.

A complete, linear time, hash would be useful too.

From matthew.fluet at gmail.com Wed Jan 6 12:29:37 2010
From: matthew.fluet at gmail.com (Matthew Fluet)
Date: Wed Jan 6 12:30:14 2010
Subject: [MLton] max-heap setting for 64-bit applications
In-Reply-To: <162de7480912141050q71e796b4vf7e2e0992a1e7a89@mail.gmail.com>
References: <4B1FD8BE.2090801@reactive-systems.com>
	<4B228C5B.9010506@reactive-systems.com>
	<162de7480912111617o66774228l32a655ec5776395f@mail.gmail.com>
	<162de7480912120524m3d27e854x333460efc628fb79@mail.gmail.com>
	<162de7480912140945rb4a0ad3x57a5dcf99dad2c92@mail.gmail.com>
	<162de7480912141050q71e796b4vf7e2e0992a1e7a89@mail.gmail.com>
Message-ID:

On Mon, Dec 14, 2009 at 1:50 PM, Wesley W. Terpstra wrote:
> On Mon, Dec 14, 2009 at 6:53 PM, Matthew Fluet wrote:
>> I was thinking about the 32-bit case, in which case fragmenting the 4G
>> VM isn't terribly difficult.
>
> Ok, then let me address your original comment again:
>> But, the Windows specific mremap could still fail --- note that the
>> growHeap function demands significant growth from mremap.  If that
>> fails, then it attempts the alloc/copy, but allowing for an alloc of a
>> heap down to the minimum size.
>
> This sounds like a bad idea, then. The problem applies equally to all
> platforms (even linux).
>
> So, to be clear, we're talking about this situation:
>   memory is available for the minimum size
>   that memory is NOT large enough to accommodate "significant growth"
>   the memory region contains the current mapping
>
> Under this case, AFAICT, every system fails to grow the heap, though
> it should succeed. Both windows and linux would've been fine if the GC
> had attempted to use mremap backed off to a lower value.
>
> I submit that this is a bug which should be fixed.

Agreed, although the trade-offs are not clear cut.  There was a bit of
churn in the heap resizing code in response to the thread started at:
  http://mlton.org/pipermail/mlton/2008-April/030230.html
It's worth reading through that thread, as it describes some
interesting behaviors and some of the pros/cons:
  http://mlton.org/pipermail/mlton/2008-May/030265.html
    -- should we shrink a heap before attempting to allocate a larger heap?
  http://mlton.org/pipermail/mlton/2008-July/030285.html
    -- mremap (linux) can't always get as much memory as an unmap/mmap
       (ie., using page to disk)

In any case, I remember that one problem with the previous code was
that the backoff scheme in remapHeap and createHeap used a linear
scheme.  That is, it backed off by (desiredSize - minSize) / 16 each
iteration, and then, as a last resort, tried minSize.  When
approaching an out-of-memory situation, we would tend to have a
minSize that was just 4K larger than the current heap size and a very
large desiredSize (because there is lots of live data) that was
unattainable, but was also so large that we couldn't mremap to
minSize + (desiredSize - minSize) / 16.  So, as a last resort, we
would mremap to minSize = curSize + 4K --- gaining us all of 4K as a
result of the garbage collection.  And we'd very quickly be back in
the garbage collector where it would happen all over again.
Eventually, we would move up by 4K increments until mremap couldn't
succeed and we'd either page to disk or die with out-of-memory.

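The failure mode above can be made concrete with a small simulation.
This is a minimal sketch and not the MLton runtime code: try_map is a
hypothetical stand-in for mremap/mmap that succeeds whenever the
request fits under an assumed limit, and linear_backoff follows the
scheme only as described in this message (16 candidates spaced
(desiredSize - minSize) / 16 apart, then minSize as a last resort).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for mremap/mmap: the request succeeds only if
   it fits under `limit`, the largest mapping the OS would grant. */
int try_map(uint64_t size, uint64_t limit) {
    return size <= limit;
}

/* Linear backoff as described above: try desired, back off by
   (desired - min) / 16 per iteration, then try min as a last resort.
   Returns the size obtained, or 0 if even min cannot be mapped. */
uint64_t linear_backoff(uint64_t desired, uint64_t min, uint64_t limit) {
    assert(desired >= min);
    uint64_t step = (desired - min) / 16;
    uint64_t size = desired;
    for (int i = 0; i < 16; i++) {
        if (try_map(size, limit))
            return size;
        size -= step;  /* the last candidate tried is about min + step */
    }
    return try_map(min, limit) ? min : 0;
}
```

With a 1 GiB current heap, minSize = 1 GiB + 4K, desiredSize = 2 GiB,
and an assumed 32 MiB of headroom, the step is just under 64 MiB, so
every candidate down to minSize + step fails and the call returns
minSize: 32 MiB was available, but the heap grows by only 4K.
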
That prompted the idea of "significant growth" in growHeap when
invoking remapHeap.
  http://mlton.org/cgi-bin/viewsvn.cgi?view=rev&rev=6783
Of course, this could just shift the problem to createHeap after a
page to disk, again not hitting the actual maximum available and
instead hitting minSize.

Some time later, I switched the backoff scheme to use a
logarithmic(?) scheme.
  http://mlton.org/cgi-bin/viewsvn.cgi?view=rev&rev=7057
We would use more iterations, because we would shrink the amount of
backoff as we approached the minSize.  This makes the eventual
successful mremap or mmap much closer to the maximum size that could
be successfully mremap-ed or mmap-ed.  This has much better
"performance": on a memory-leak program, I would get an out-of-memory
error after about 5 minutes and two or three pagings to disk, rather
than after hour(s?) of 3G garbage collections that obtained 4K
increases in heap size and numerous pagings to disk.

Of course, this would also help with the original issue that prompted
the need for "significant growth".  Perhaps the adaptive backoff with
mremap would work satisfactorily with the true minSize.

From wesley at terpstra.ca Thu Jan 7 14:36:34 2010
From: wesley at terpstra.ca (Wesley W. Terpstra)
Date: Thu Jan 7 14:36:40 2010
Subject: [MLton] target-map considered harmful
In-Reply-To:
References: <162de7480912150421m38f30c2cx75d1bab647307470@mail.gmail.com>
Message-ID: <162de7481001071436y30bf0afbq34fd61a24665c0ff@mail.gmail.com>

On Wed, Jan 6, 2010 at 3:50 PM, Matthew Fluet wrote:
> That seems like a fine solution.  I would suggest that we put all of
> the targets within a single "target" directory of the lib directory.
> That would make it easy to find all the targets.
>
> It might also be good to move the c-types.sml file into the
> target-specific directory.  Then a target would be entirely self
> contained.

I've attached a patch which does this and committed two orthogonal
(but necessary) changes to svn/HEAD.

As you can see in the patch, I've moved the target directories into a
'targets' sub-folder in the mlton lib directory. The OS and
Architecture are listed in the files 'os' and 'arch' respectively in
the appropriate target folder. Finally, I moved c-types.sml into an
'sml' folder for the given target. The directory layout looks now
like:

terpstra@orange:~/mlton/build/lib$ find targets/
targets/
targets/self
targets/self/sml
targets/self/sml/c-types.sml
targets/self/include
targets/self/include/c-types.h
targets/self/arch
targets/self/libgdtoa.a
targets/self/libgdtoa-pic.a
targets/self/constants
targets/self/libmlton.a
targets/self/sizes
targets/self/libgdtoa-gdb.a
targets/self/os
targets/self/libmlton-pic.a
targets/self/libmlton-gdb.a

Most of the changes were to the Makefile.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: targets.patch
Type: text/x-diff
Size: 9949 bytes
Desc: not available
Url : http://mlton.org/pipermail/mlton/attachments/20100107/d0d05715/targets.bin

From matthew.fluet at gmail.com Mon Jan 11 10:49:02 2010
From: matthew.fluet at gmail.com (Matthew Fluet)
Date: Mon Jan 11 10:49:36 2010
Subject: [MLton] target-map considered harmful
In-Reply-To: <162de7481001071436y30bf0afbq34fd61a24665c0ff@mail.gmail.com>
References: <162de7480912150421m38f30c2cx75d1bab647307470@mail.gmail.com>
	<162de7481001071436y30bf0afbq34fd61a24665c0ff@mail.gmail.com>
Message-ID:

On Thu, Jan 7, 2010 at 5:36 PM, Wesley W. Terpstra wrote:
> On Wed, Jan 6, 2010 at 3:50 PM, Matthew Fluet wrote:
>> That seems like a fine solution.  I would suggest that we put all of
>> the targets within a single "target" directory of the lib directory.
>> That would make it easy to find all the targets.
>>
>> It might also be good to move the c-types.sml file into the
>> target-specific directory.  Then a target would be entirely self
>> contained.
>
> I've attached a patch which does this and committed two orthogonal
> (but necessary) changes to svn/HEAD.
>
> As you can see in the patch, I've moved the target directories into a
> 'targets' sub-folder in the mlton lib directory. The OS and
> Architecture are listed in the files 'os' and 'arch' respectively in
> the appropriate target folder. Finally, I moved c-types.sml into an
> 'sml' folder for the given target. The directory layout looks now
> like:
>
> terpstra@orange:~/mlton/build/lib$ find targets/
> targets/
> targets/self
> targets/self/sml
> targets/self/sml/c-types.sml
> targets/self/include
> targets/self/include/c-types.h
> targets/self/arch
> targets/self/libgdtoa.a
> targets/self/libgdtoa-pic.a
> targets/self/constants
> targets/self/libmlton.a
> targets/self/sizes
> targets/self/libgdtoa-gdb.a
> targets/self/os
> targets/self/libmlton-pic.a
> targets/self/libmlton-gdb.a
>
> Most of the changes were to the Makefile.

Looks very good.  My only suggestion might be to change the
  ../../../targets/$(TARGET)/sml/c-types.sml
to
  $(LIB_MLTON_DIR)/targets/$(TARGET)/sml/c-types.sml
just because it seems more natural to think of the targets directory
as relative to the lib/mlton directory than relative to the basis
library directory.  Also, it is helpful to be able to type-check the
basis library from within the /basis-library directory without
installing it into the build/lib/sml/basis directory.