[MLton] interrupted system call

Tue, 23 Mar 2004 13:55:15 -0500 (EST)

> > I don't know exactly what MLton does, but at the OS level, it depends on what
> > the sa_flags of the struct sigaction is set to.  If you want system calls to
> > be restarted (after the signal handler is called) then you should set it to
> > SA_RESTART.  If you don't then as I recall Linux is more system-V-ish, which
> > means that the system call fails with an errno as you see it.
>
> We had a discussion about this back in July 2000
>
> 	http://www.mlton.org/pipermail/mlton/2000-July/

Thanks for the pointer.

> The discussion explains that we don't want to use SA_RESTART because
> it will prevent the SML signal handler from running, and that's what
> we ended up doing.  I also explained what we should do as a
> consequence.
>
>     I think this problem could be alleviated by having all system
>     calls restarted at the ML level, within the basis library code.
>     After every system call, the library code would then check errno,
>     and if it's EINTR, would run the ML signal handler with a current
>     thread that knows to restart the system call.

O.k., so that suggests something like:

Posix.Error.restart : ('a -> 'b) -> 'a -> 'b

fun restart f x =
  (f x) handle (exn as SysErr (_, SOME serr)) =>
        if serr = intr then restart f x else raise exn

(Or maybe something more primitive that directly checks the error, rather
than wrapping something that raises SysErr.)
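A minimal sketch of that more primitive variant, assuming the raw call hands back either a result or the errno it failed with. The `raw` datatype, `restartRaw`, and the simulated `fakeRead` are all hypothetical names for illustration; this is not MLton's actual runtime interface.

```sml
(* Hypothetical result of a raw system call: either a value, or the
   errno reported by the failed call. *)
datatype 'a raw = Ok of 'a | Errno of Posix.Error.syserror

(* Retry while the call reports EINTR; any other errno becomes SysErr. *)
fun restartRaw (call : 'a -> 'b raw) (x : 'a) : 'b =
  case call x of
    Ok v => v
  | Errno e =>
      if e = Posix.Error.intr
      then restartRaw call x
      else raise OS.SysErr (Posix.Error.errorMsg e, SOME e)

(* Simulated system call: interrupted twice, then succeeds. *)
val calls = ref 0
fun fakeRead n =
  (calls := !calls + 1;
   if !calls <= 2 then Errno Posix.Error.intr else Ok n)

val got = restartRaw fakeRead 7
```

Here `got` ends up as 7 after three attempts. Checking the error directly avoids raising and catching an exception on every interrupted call, at the cost of needing access to the raw return value.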

In any case, since there is a loop in the CFG, the backend should put in
the limit-check that gets triggered when signals are being handled.

On the other hand, if signals aren't being handled in ML, then what
happens to the signal?  I guess the process is terminated?

Another issue is critical regions.  If the system call is buried down in
the basis, and I try to wrap a higher-level function in atomicBegin/End,
then the ML signal handler won't get run in the restart loop.  For some
signals, maybe this is o.k. (keep making progress until we leave the
critical region), but for the interval timer, this is bad: if the
system call takes an appreciable amount of time, then the signal is
likely to be raised again when the call is restarted, so the call may
never complete.

Maybe the right thing is to put a canHandle check in the restart loop, and
if we are in a critical region, then just let the exception propagate.
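Roughly, the canHandle idea could look like the sketch below. The `atomicDepth` counter and the `atomicBegin`, `atomicEnd`, and `canHandle` functions defined here are a hypothetical model of the critical-region state so the example is self-contained; they are stand-ins, not MLton's real primitives.

```sml
(* Hypothetical model of the critical-region nesting depth. *)
val atomicDepth = ref 0
fun atomicBegin () = atomicDepth := !atomicDepth + 1
fun atomicEnd () = atomicDepth := !atomicDepth - 1
fun canHandle () = !atomicDepth = 0

(* Restart on EINTR only when the ML handler would get a chance to run;
   inside a critical region, let the exception propagate. *)
fun restart f x =
  f x handle exn as OS.SysErr (_, SOME serr) =>
    if serr = Posix.Error.intr andalso canHandle ()
    then restart f x
    else raise exn

(* Simulated system call: raises EINTR while interruptions are pending. *)
val pending = ref 1
fun flaky n =
  if !pending > 0
  then (pending := !pending - 1;
        raise OS.SysErr ("interrupted", SOME Posix.Error.intr))
  else n + 1

(* Outside a critical region the call is retried and succeeds. *)
val outside = restart flaky 1

(* Inside a critical region the EINTR escapes to the caller. *)
val inside =
  (pending := 1;
   atomicBegin ();
   (SOME (restart flaky 1) handle OS.SysErr _ => NONE)
   before atomicEnd ())
```

With this model, `outside` is 2 and `inside` is NONE.  The caller that opened the critical region then has to decide what to do with the interrupted call once it leaves the region.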

Another option might be to have the C signal handler block signals until
the ML signal handler gets a chance to run.