[MLton] switching to handler caught threads
Matthew Fluet
fluet@cs.cornell.edu
Wed, 31 Mar 2004 21:23:43 -0500 (EST)
I've tracked down a problem to an interesting (and I presume buggy)
interaction between the threads caught by the signal handler and other
threads. Note that the thread the ML signal handler gets its hands on
is built like:
   fun fromPrimitive (t: Prim.thread): unit t =
      T (ref (Paused
              (fn f => ((atomicEnd (); f ())
                        handle _ =>
                        die "Asynchronous exceptions are not allowed.\n"),
               t)))
I don't understand the purpose of the atomicEnd(). Recall the definition
of switching:
   fun ('a, 'b) atomicSwitch' (f: 'a t -> 'b t * (unit -> 'b)): 'a =
      if !switching
         then (atomicEnd ()
               ; raise Fail "nested Thread.switch")
      else
         let
            val _ = switching := true
            val r : (unit -> 'a) ref =
               ref (fn () => die "Thread.atomicSwitch' didn't set r.\n")
            val t: 'a thread ref =
               ref (Paused (fn x => r := x, Prim.current ()))
            fun fail e = (t := Dead
                          ; switching := false
                          ; atomicEnd ()
                          ; raise e)
            val (T t': 'b t, x: unit -> 'b) = f (T t) handle e => fail e
            val primThread =
               case !t' before (t' := Dead; switching := false) of
                  Dead => fail (Fail "switch to a Dead thread")
                | New g => newThread (g o x)
                | Paused (f, t) => (f x; t)
            val _ = Prim.switchTo primThread
            val _ = atomicEnd ()
         in
            !r ()
         end
We assume that this function is called with canHandle > 0. In particular,
it should be callable with canHandle == 1. Now, suppose the signal
handler has caught a thread (built using fromPrimitive) and pushes it on a
ready queue. Later, we decide to switch to that thread. If we enter
atomicSwitch' with canHandle == 1, then as we evaluate the expression for
primThread, we're still at canHandle == 1. But, the atomicEnd prepended
by fromPrimitive gets run under the Paused branch, dropping us to
canHandle == 0. Now, the signal handler gets a chance to run (although it
shouldn't be running here), before we switch to primThread and before we
reach the final atomicEnd in atomicSwitch'. Now, we're in a totally
screwy state. In particular, this atomicSwitch' and the signal handler
both have a hold of the current thread, which is bad.
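
The ordering problem can be sketched with a toy model. This is not MLton's runtime: canHandle, signalPending, runHandlerIfAllowed, pausedFn and switch are hypothetical stand-ins. The model also assumes, as a simplification, that a pending signal is delivered the instant the counter reaches 0, and it stops at the point where Prim.switchTo would be called.

```sml
(* Toy model only; every name here is a hypothetical stand-in. *)
val canHandle = ref 0           (* > 0 means "inside an atomic section" *)
val signalPending = ref true    (* assume a signal is waiting to be handled *)
val log : string list ref = ref []
fun note s = log := s :: !log

(* Simplification: deliver the pending signal as soon as canHandle hits 0. *)
fun runHandlerIfAllowed () =
   if !canHandle = 0 andalso !signalPending
      then (signalPending := false; note "handler runs")
   else ()

fun atomicBegin () = canHandle := !canHandle + 1
fun atomicEnd () = (canHandle := !canHandle - 1; runHandlerIfAllowed ())

(* Mimics the function that fromPrimitive stores in Paused:
   it calls atomicEnd before running its argument. *)
fun pausedFn (f: unit -> unit) = (atomicEnd (); f ())

(* Mimics the tail of atomicSwitch' up to the primitive switch. *)
fun switch () =
   let
      val () = note "evaluating primThread"
      val () = pausedFn (fn () => ())   (* the "f x" in the Paused branch *)
      val () = note "Prim.switchTo"     (* the real switch would happen here *)
   in
      ()
   end

val () = atomicBegin ()   (* enter the switch with canHandle = 1 *)
val () = switch ()
val trace = rev (!log)
```

Under these assumptions, trace is ["evaluating primThread", "handler runs", "Prim.switchTo"]: the handler is logged between evaluating primThread and the primitive switch, which is the window described above.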