[MLton] Cygwin->Mingw32: patch + future

Stephen Weeks MLton@mlton.org
Mon, 22 Nov 2004 15:37:06 -0800

> In hindsight, I think if there was an option '-target-cc foo "gcc
> -mno-cygwin"' that would be enough. However, I think that keeping these
> settings in the mlton script is the wrong place. Target compilation
> options should be stored in the usr/lib/mlton/target/ directory. If the
> mlton script sourced files from there, that would be ideal.

This looks like a good idea, if we can work around a problem or two.
It does let us specify cc-opt, and link-opt, as well as other options
on a per-target basis with more flexibility than the current approach.

One problem that I see is where do we put the -target-{cc,link}-opt
options that currently live in mlton-script before the
lib/mlton/<target> directories are created?  Most people only have
lib/mlton/self.  It's only after installing a cross compiler that
these flags are added.  I guess we could change the add-cross script
so that when it creates lib/mlton/<target> it installs a default set
of flags?

> If a -target-cc was added, it must support multiple words (unlike the
> current -cc option). The default for a target 'bar' should be
> "$(cc) -b bar" where $(cc) is the setting of -cc for 'self'.

With your approach, I guess we don't need -target-cc, since instead we
would have an appropriate -cc flag in lib/mlton/<target>/flags.  And
maybe we don't even need multiple words, since we can make the first
-cc-opt be the appropriate -b or -mno-cygwin.  And MLton will preserve
the order of the cc options.

> I actually ended up adding it to MLTON_CHILD.
> The reason was that in basis-extra.mlb the ordering prevents it from
> working. A proper implementation similar to system/unix.sml needs the signal
> mask stuff which is in signal.sml. But the dependency
> signal->thread->exn->process prevents me from putting the code there.

OK for now.  I'll revisit this when we're ready to checkin.

> > * What is RedirectionFailure for?
> It was supposed to be if it failed to create the PIPE or open the FILE.
> Right now I just let the exceptions trickle through. I am not sure what the
> best solution is; I am unused to using exceptions for error-handling...

For now let's let the underlying exceptions trickle through.

> > * I wonder if it makes sense to change the semantics from how the
> > Unix module behaves so that, e.g., two calls to textStdinOf return the
> > same stream instead of creating a new one (which has no reasonable
> > behavior).
> Yes! Please!

Let's doit then.  We are not so constrained in defining our own

> I propose to remove all the text/binary toggles out of the
> SML library and put all MLton filedescs in binary mode. Then add translation
> at the level of TextIO. More work to be done in the MLton basis library,
> true. However, this will probably be more portable in the long run and
> makes several otherwise impossible problems solvable
> Make both TextIO.stdin and BinIO.stdin point to the same underlying buffer
> and bingo, it finally works the way C should have. If you combine this with
> your above suggestion of returning the same stream, one can even guarantee
> that reads/writes to the same fd are not order-mangled by the buffers.

This seems like a fine idea to me.  I am interested to hear others'
thoughts though.

> > Here's a MLtonization of your signature, with my proposed changes.
> > 	    val file: string -> 'a isNotPipe t
> > 	    val inherit: 'a isNotPipe t
> > 	    val lie: 'a isNotPipe t -> 'a isPipe t
> > 	    val null: 'a isNotPipe t
> > 	    val pipe: 'a isPipe t
> Uhm, "lie"?
> That sounds like a dangerous option.
> What's it for?

The idea was that one might want to write a piece of code (e.g. a
shell) that dynamically decides whether or not an argument to create
is a pipe.  In that case, we must have a common supertype that covers
both types.  My approach was to allow the client to lie about the type
by treating an isNotPipe as an isPipe and having the get* functions
raise exceptions.  This was a bad idea.  This can be done in a more
type safe manner.  I've included a new approach in the code below.

> The only 'unsafe' part of my new interface is that you can pass the stdout
> of one process into the stdin of more than one process. This gives an
> exception. I think it is impossible to prevent this with the type system
> though, right?


> To keep cygwin on the level of mingw, I've switched cygwin to using
> fork() and mmap(). For me, it appears to work (I have the newest
> cygwin). The change to enable cygwin+fork() is currently part of my
> patchset.

Unfortunately, using fork+mmap is still not acceptable on Cygwin in
some cases.  See

> I intend to figure out how to run the regression tests today.
> However, I only have access to cygwin, mingw32, and linux.

That's fine.  We test the other platforms sporadically, and I will try
to remember run some tests after the checkin.

> I've attached the current signature for your approval.

It makes sense.  I think it can be cleaned up a little.  For example,
the types of create, kill and reap are unnecessarily complicated.
Also, the types of getStd{err,in,out} can be simplified.  Finally, for
cases like toTextIOin, where you use TextIO.instream as a phantom
type, I prefer to use a new phantom type, since that makes it clear
that there is no connection between the TextIO.instream on the left
and on the right of the arrow.

Incorporating all those changes, plus grouping things in the MLton
style and renaming a little, here's my latest proposal.  This proposal
also includes an unknown type, and functions Child.dest and
Param.forget as an attempt at solving the "lie" problem mentioned

signature MLTON_CHILD =
      type ('stdin, 'stdout, 'stderr) t
      type signal

      (* Use phantom types to:
       * 1. ensure that a stream is extracted only if the parameter was a pipe.
       * 2. Ensure that a pipe is only extracted in one way: binary stream,
       *    chain, file descriptor, or text stream.
       * 3. Ensure that the extracted stream and param streams have the right
       *    direction. 
      type bin
      type chain
      type fd
      type text
      type input
      type output
      type 'a isPipe (* 'a will be bin, chain, fd, or text *)
      type isNotPipe
      type unknown
      structure Child:
	    type ('a, 'b) t

	    datatype ('a, 'b) dest =
	       IsNotPipe of (isNotPipe, 'b) t
	     | IsPipe of ('a isPipe, 'b) t
	    val dest: (unknown, 'b) t -> ('a, 'b) dest
	    val toBinIn: (bin isPipe, input) t -> BinIO.instream
	    val toBinOut: (bin isPipe, output) t -> BinIO.outstream
	    (* toFD is not necessarily available on all systems.  It may raise
	     * an exception.
	    val toFD: (fd isPipe, 'b) t -> Posix.FileSys.file_desc
	    val toTextIn: (text isPipe, input) t -> TextIO.instream
	    val toTextOut: (text isPipe, output) t -> TextIO.outstream

      structure Param:
	    type ('a, 'b) t

	    (* {child,fd} close their parameter when create is called.
	     * Therefore, they may only be used once.  fd is not necessarily
	     * available on all systems.  It may raise an exception.
	    val child: (chain isPipe, 'b) Child.t -> (isNotPipe, 'b) t
	    val fd: Posix.FileSys.file_desc -> (isNotPipe, 'b) t
	    val file: string -> (isNotPipe, 'b) t
	    val forget: ('a, 'b) t -> (unknown, 'b) t
	    val null: (isNotPipe, 'b) t
	    val pipe: ('a isPipe, 'b) t
	    val self: (isNotPipe, 'b) t
      val create:
	 {args: string list, 
	  env: string list option, 
	  path: string, 
	  stderr: ('stderr, output) Param.t,
	  stdin: ('stdin, input) Param.t,
	  stdout: ('stdout, output) Param.t}
	 -> ('stdin, 'stdout, 'stderr) t
      val getStderr: ('a, 'b, 'c) t -> ('c, input) Child.t
      val getStdin: ('a, 'b, 'c) t -> ('a, output) Child.t
      val getStdout: ('a, 'b, 'c) t -> ('b, input) Child.t
      val kill: ('a, 'b, 'c) t * signal -> unit
      val reap: ('a, 'b, 'c) t -> OS.Process.status

Here's your example translated to use this new signature.

functor Test (S: MLTON_CHILD) =

open S

val p = create {args = ["-l", "/tmp"],
		env = NONE,
		path = "e:\\windows\\foo.exe",
		stderr = Param.self,
		stdin = Param.null,
		stdout = Param.pipe}

val q = create {args = ["--", "2"],
		env = NONE,
		path = "e:\\msys\\1.0\\bin\\grep.exe",
		stderr = Param.self,
		stdin = Param.child (getStdout p),
		stdout = Param.pipe}

val _ = print "Sucking stream\n"
val _ = TextIO.inputAll (Child.toTextIn (getStdout q))
val _ = print "Done sucking\n"
val _ = reap p (* Posix.Signal.term *)
val _ = print "Reaped\n"
val _ = reap q (* Posix.Signal.term *)
val _ = print "Reaped\n"