I/O speed

Stephen Weeks MLton@sourcelight.com
Thu, 21 Sep 2000 16:28:49 -0700 (PDT)

OK.  I am suitably embarrassed.  I fixed TextIO.

Executive Summary:

Here are the numbers for gcc and MLton on my machine for code that reads 1G from
standard input:

	time(s)	M/s
MLton	32.39	30
gcc	26.73	37
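
As a sanity check on the M/s column, here is a small sketch of the arithmetic.
It assumes the input is the 1k x 1000000 bytes produced by the dd invocation
used for the tests, and that "M" means 2^20 bytes; the function name
mib_per_sec is made up for illustration.

```c
/* Throughput in MiB/s for `count` blocks of `block_size` bytes
   consumed in `seconds` of system + user time. */
double mib_per_sec(double block_size, double count, double seconds) {
	double bytes = block_size * count;
	return bytes / (1024.0 * 1024.0) / seconds;
}
```

Under those assumptions, mib_per_sec(1024, 1000000, 32.39) comes to about 30.15
and mib_per_sec(1024, 1000000, 26.73) to about 36.53, which agrees with the 30
and 37 in the table after rounding.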


I fixed the basis library implementation of TextIO.input1 so that the common
case (i.e. the character is in the I/O buffer) is fast.  Here's the comparison
of gcc and MLton.

Here is the C code.

/* foo.c */
#include <stdio.h>

int main() {
	int ch;

	/* Read stdin one character at a time until end of input. */
	for (;;) {
		ch = getchar();
		if (ch == EOF)
			break;
	}
	return 0;
}
/* end foo.c */

I compiled foo.c with the following:
  gcc -O2 -o foo -D_IO_getc=_IO_getc_unlocked -D_IO_putc=_IO_putc_unlocked foo.c

I then tested it with the following:
  dd bs=1k count=1000000 </dev/zero | time foo

The system + user time was 26.73 seconds.

Spy showed the loop as the following:

0x80483f3:      movl   0x8049550,%edx
0x80483f9:      movl   0x4(%edx),%eax
0x80483fc:      cmpl   0x8(%edx),%eax
0x80483ff:      jb     0x8048410
0x8048410:      movzbl (%eax),%eax
0x8048413:      incl   0x4(%edx)
0x8048416:      cmpl   $0xffffffff,%eax
0x8048419:      jne    0x80483f3
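
The loop above looks like the inlined fast path of an unlocked getc: compare
the stream's read pointer against the end of the buffered data, and if a byte
is available, load it and bump the pointer. Here is a minimal C sketch of that
pattern. The struct and function names (in_buf, buf_getc, refill, drain) are
hypothetical and are not glibc's real definitions; the slow path is stubbed
out to report end of input instead of calling read().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a stream's buffer state. */
struct in_buf {
	const unsigned char *read_ptr;  /* next unread byte */
	const unsigned char *read_end;  /* one past the last buffered byte */
};

/* Slow path. A real implementation would refill the buffer from the
   file descriptor; this sketch simply reports end of input (-1). */
int refill(struct in_buf *b) {
	(void)b;
	return -1;
}

/* Fast path: one compare, one load, one increment. */
int buf_getc(struct in_buf *b) {
	if (b->read_ptr < b->read_end)
		return *b->read_ptr++;
	return refill(b);
}

/* Consume everything left in the buffer and return the byte count. */
size_t drain(struct in_buf *b) {
	size_t n = 0;
	assert(b->read_ptr <= b->read_end);
	while (buf_getc(b) != -1)
		n++;
	return n;
}
```

With a three-byte buffer, one buf_getc call followed by drain returns 2, and
every later buf_getc call returns -1. The point is that the common case
touches only the two pointers and the byte itself, so the per-character cost
is a handful of instructions.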

Now, for the MLton version (with the new text-io.sml).

(* z.sml *)
open TextIO
val ins = stdIn
fun loop() =
   case input1 ins of
      NONE => ()
    | SOME _ => loop()
val _ = loop()
(* end z.sml *)

Compiled with "$HOME/mlton/bin/mlton z.sml"

Tested with "dd bs=1k count=1000000 </dev/zero | time z"

The system + user time was 32.39 seconds.

Spy showed the loop as the following:

0x804a830:      movl   0x1c(%ebx),%eax
0x804a833:      movl   (%eax),%edx
0x804a835:      movl   0x805096c,%eax
0x804a83a:      movl   (%eax),%edi
0x804a83c:      movl   0x8050970,%eax
0x804a841:      cmpl   (%eax),%edi
0x804a843:      jge    0x804aadc
0x804a849:      movl   0x805096c,%eax
0x804a84e:      leal   0x1(%edi),%ecx
0x804a851:      movl   %ecx,(%eax)
0x804a853:      cmpl   $0xfff,%edi
0x804a859:      jbe    0x804a830