[MLton] A few changes proposals for MLton

Matthew Fluet fluet at tti-c.org
Fri Nov 14 15:57:11 PST 2008


On Thu, 13 Nov 2008, Nicolas Bertolotti wrote:
> *   18-fix-big-heap-disk-dump.patch
>    *   This patch fixes a crash I could observe when a very large heap
>      is paged to disk. It seems that in some cases, fwrite() fails to
>      dump the whole heap at once and we have to split.

I encountered this as well on x86-linux; an
fwrite (data, size, count, file) with size * count >= 2G immediately 
returns 0 without writing any data and without setting ferror or errno. 
(I.e., there is no indication as to why the operation failed.)
This doesn't happen on x86-darwin or on amd64-linux (though, I didn't try 
size * count >= 2^63 (which ought to fail anyways, as the memory system 
and file system can't take it)).  Nor is there a corresponding problem 
with fread.  This is rather annoying, as fwrite is supposed to be the 
abstraction that removes the need for looping writes.

With regards to the patch:

--- mlton-r6698.patched/runtime/util/safe.h	2008-09-09 10:54:08.000000000 +0200
+++ mlton-r6698/runtime/util/safe.h	2008-08-13 15:07:57.000000000 +0200
@@ -68,8 +68,21 @@
  static inline void fwrite_safe (const void *buf, size_t size, size_t count,
                                  FILE *f) {
    size_t res;
-
-  res = fwrite (buf, size, count, f);
+  enum {
+    WRITE_CHUNK_SIZE = 0x2000000, /* 32M */
+  };
+
+  res = 0;
+  while (res < count) {
+    size_t n;
+    n = count - res;
+    if (n > WRITE_CHUNK_SIZE)
+      n = WRITE_CHUNK_SIZE;
+    n = fwrite ((void *) ((char *) buf + res), size, n, f);
+    if (n == 0)
+      break;
+    res += n;
+  }
    if (res != count)
      diee ("fwrite (_, %"PRIuMAX", %"PRIuMAX", _) failed "
            "(only wrote %"PRIuMAX").\n",


This kind of misses the point of the *_safe functions --- they are 
supposed to signal errors, not mask them.  The specification of fwrite 
indicates that any short object count return value is an error.  Also, the 
code above only handles a large count, not a large size.  So, it won't 
help if one makes the seemingly meaning preserving change (in 
diskBack.unix.c) from:
   fwrite_safe (buf, 1, size, f);
to
   fwrite_safe (buf, size, 1, f);

I would isolate the extreme call to fwrite_safe as in this alternative 
patch.  Although fread doesn't fail on large reads, it seems prudent to 
limit extreme I/O; also, it makes the program more responsive to ctrl-C 
(as control frequently returns to user-mode).
-------------- next part --------------
diff --git a/runtime/platform/diskBack.unix.c b/runtime/platform/diskBack.unix.c
index f9df82d..95bcea0 100644
--- a/runtime/platform/diskBack.unix.c
+++ b/runtime/platform/diskBack.unix.c
@@ -34,9 +34,21 @@ typedef struct {
 void GC_diskBack_read (void *data, pointer buf, size_t size) {
   FILE *f;
 
+  const size_t READ_CHUNK_SIZE = 0x2000000; /* 32M */
+
   f = ((WriteToDiskData)data)->f;
   fseek_safe (f, 0, SEEK_SET);
-  fread_safe (buf, 1, size, f);
+  /* fread (_, 1, size, _) succeeds
+   * with size >= 2^31
+   * for a 32-bit executable on 64-bit linux.
+   * Nonetheless, match GC_diskBack_write.
+   */
+  while (size > 0) {
+    size_t s = min (READ_CHUNK_SIZE, size);
+    fread_safe (buf, 1, s, f);
+    buf += s;
+    size -= s;
+  }
 }
 
 void GC_diskBack_close (void *data) {
@@ -51,8 +63,20 @@ void *GC_diskBack_write (pointer buf, size_t size) {
   FILE *f;
   WriteToDiskData d;
 
+  const size_t WRITE_CHUNK_SIZE = 0x2000000; /* 32M */
+
   f = tempFileDes ();
-  fwrite_safe (buf, 1, size, f);
+  /* fwrite (_, 1, size, _) fails
+   * (with no helpful error conditions!)
+   * with size >= 2^31
+   * on x86-linux.
+   */
+  while (size > 0) {
+    size_t s = min (WRITE_CHUNK_SIZE, size);
+    fwrite_safe (buf, 1, s, f);
+    buf += s;
+    size -= s;
+  }
   d = (WriteToDiskData)(malloc_safe (sizeof(*d)));
   d->f = f;
   return d;


More information about the MLton mailing list