[MLton] Cygwin->Mingw32: patch + future

Stephen Weeks MLton@mlton.org
Tue, 23 Nov 2004 11:26:11 -0800


> > One problem that I see is where do we put the -target-{cc,link}-opt
> > options that currently live in mlton-script before the
> > lib/mlton/<target> directories are created?
> 
> What's wrong with creating a big collection of directories with only default
> options? It's not like disk space is an issue. Besides, it's advertising. ;)

I think too many targets, across differing architectures, would end up
using the same options, and I'd rather gather the default options in
one place than have them duplicated across multiple similar targets.
Perhaps a single shell script called 'default-target-opts' or somesuch
that takes a target name and spits out our best guess for the mlton
options.  Then, the Makefile and bin/add-cross script could use
default-target-opts to fill in the appropriate file in lib/self and
lib/<target>.

> BTW. I've attached another patch for cross-compiling.
> This time from linux->mingw (was testing it).
> 
> When picking the temp dir to use during build, you should use the 
> compiler's host to pick the dir---not the target. (cross-compile-bug.patch)

Patch applied.  Thanks.

> > > I propose to remove all the text/binary toggles out of the
> > > SML library and put all MLton filedescs in binary mode. Then add translation
> > > at the level of TextIO.
...
> I'd like to keep this issue separate from the pipe issue I'm working on now
> if you don't mind. I'll look into this after the MLton.Child is perfected.

Yes, it is a good idea to keep this separate.  As Matthew pointed out,
there are lots of issues connected with just this that need to be
resolved.

> What's the problem? You shouldn't be requiring contiguous address
> space anyways;

It is unfortunate, but we do.  It would require a major runtime
rewrite to fix this.  And it's not clear it's worth the effort.

> that's never going to be possible to achieve portably. 

I'm not sure what problem you're referring to.  We haven't had any
problems getting a contiguous chunk on our various platforms (except
that Cygwin's mmap doesn't give as much as it could).

> Also, I've rebuilt mlton several times now using the mlton with mmap
> instead of VirtualAlloc. Are there really any bigger projects than
> MLton itself?

Yes.  And I've even experienced problems with compiling MLton due to
exactly this.

> What sort of problems does this cause?
> I see random segfaults of the mlton compiler under cygwin, but I had those
> with the VirtualAlloc version too. I haven't spent time tracking it yet.

I haven't seen those and would be interested to track them down too,
at least until we find it's YACB (yet another Cygwin bug).

> > Finally, for cases like toTextIOin, where you use TextIO.instream as a
> > phantom type, I prefer to use a new phantom type, since that makes it
> > clear that there is no connection between the TextIO.instream on the left
> > and on the right of the arrow.
...
> 1. No connection?
>    It seems to me to be a great deal more specific to say 'this thing was
>    used as a TextIO.instream so you can only get it out that way later'.
>    What is 'text'? If you read between the lines it's TextIO.*stream.
>    I think we should rather be upfront about what it is bound to.

You are right.  I was confused.  I was thinking that the
{Bin,Text}IO.{in,out}stream type arguments were phantom types, but
they are not, as one can see from the str datatype.

    datatype 'a str = FD of PFS.file_desc | STR of 'a * ('a -> unit)
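To make the point concrete: a value of type 'a really does occur inside
the STR constructor, so the type argument constrains actual data rather
than serving as a mere tag.  A minimal sketch, assuming that PFS
abbreviates Posix.FileSys and guessing that the ('a -> unit) component
is a close function (neither is stated above); closeStr is a
hypothetical helper, not something in the library:

```sml
structure PFS = Posix.FileSys

datatype 'a str = FD of PFS.file_desc | STR of 'a * ('a -> unit)

(* 'a is not phantom: the STR case holds an actual value of type 'a,
 * so a TextIO.instream str can only ever hand back a TextIO.instream.
 *)
fun closeStr (s: 'a str): unit =
   case s of
      FD fd => Posix.IO.close fd
    | STR (stream, close) => close stream

(* The annotation is forced by the payload, not chosen freely. *)
val s: TextIO.instream str = STR (TextIO.openString "hello", TextIO.closeIn)
val () = closeStr s
```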

> As a scary thought before I go to sleep, I wonder if this is a good
> idea at all? Just because Unix.* did it certainly doesn't mean it's
> right. It might make sense to want to open a pipe as WideTextIO,
> BinIO, and TextIO all at once and read things from one then the
> other. If the buffering is shared like we discussed earlier, this is
> well-defined.

I had this thought as well.  I agree that we're not clear on what
properties we would like of the extracted streams and that Unix.*
doesn't do it very well.  Given that, why don't we drop the streams
entirely, and go to a pure file-descriptor approach?  We can still
have a notion of direction to make sure that the right direction of
file descriptor is used in various places.

----------------------------------------------------------------------
signature DIRECTIONAL_FILE_DESC =
   sig
      type 'a t (* 'a is input or output *)

      (* phantom types *)
      type input (* the file desc can be turned into an instream *)
      type output (* the file desc can be turned into an outstream *)
	 
      val fd: 'a t -> Posix.FileSys.file_desc
   end

signature MLTON_CHILD =
   sig
      type t
      type signal

      structure FileDesc: DIRECTIONAL_FILE_DESC
      type input = FileDesc.input
      type output = FileDesc.output
	 
      structure Param:
	 sig
	    type 'a t (* 'a is input or output *)

	    val fd: 'a FileDesc.t -> 'a t
	    val file: string -> 'a t
	    val null: 'a t
	    val pipe: 'a t
	    val self: 'a t
	 end
      
      val create: {args: string list, 
		   env: string list option, 
		   path: string, 
		   stderr: output Param.t,
		   stdin: input Param.t,
		   stdout: output Param.t} -> t
      val getStderr: t -> input FileDesc.t
      val getStdin: t -> output FileDesc.t
      val getStdout: t -> input FileDesc.t
      val kill: t * signal -> unit
      val reap: t -> OS.Process.status
   end

functor Test (S: MLTON_CHILD) =
struct

open S

val p = create {args = ["-l", "/tmp"],
		env = NONE,
		path = "e:\\windows\\foo.exe",
		stderr = Param.self,
		stdin = Param.null,
		stdout = Param.pipe}

val q = create {args = ["--", "2"],
		env = NONE,
		path = "e:\\msys\\1.0\\bin\\grep.exe",
		stderr = Param.self,
		stdin = Param.fd (getStdout p),
		stdout = Param.pipe}

val _ = print "Sucking stream\n"
val _ = TextIO.inputAll (MLton.TextIO.newIn (FileDesc.fd (getStdout q),
					     "<stream>"))
			 
val _ = print "Done sucking\n"
val _ = reap p (* Posix.Signal.term *)
val _ = print "Reaped\n"
val _ = reap q (* Posix.Signal.term *)
val _ = print "Reaped\n"

end
----------------------------------------------------------------------

We can later figure out if/how to hook up directional file descriptors
to some kind of abstraction that extracts streams from them.
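
As a rough illustration of how the phantom direction could be enforced,
here is one possible implementation of DIRECTIONAL_FILE_DESC.  It is
only a sketch under assumptions: the structure name D, the extra
stdin/stdout values, and the writeString helper are all hypothetical
and not part of the proposal above.  Opaque ascription (:>) is what
keeps input and output distinct, even though an 'a t is just a plain
file descriptor underneath.

```sml
signature DIRECTIONAL_FILE_DESC =
   sig
      type 'a t (* 'a is input or output *)
      type input
      type output
      val fd: 'a t -> Posix.FileSys.file_desc
   end

(* Hypothetical implementation; stdin and stdout are added only so
 * that the example has some values to work with.
 *)
structure D :>
   sig
      include DIRECTIONAL_FILE_DESC
      val stdin: input t
      val stdout: output t
   end =
   struct
      type 'a t = Posix.FileSys.file_desc
      datatype input = Input    (* never constructed; used only as a tag *)
      datatype output = Output  (* never constructed; used only as a tag *)
      fun fd d = d
      val stdin = Posix.FileSys.stdin
      val stdout = Posix.FileSys.stdout
   end

(* Accepts only descriptors tagged as output. *)
fun writeString (d: D.output D.t, s: string): unit =
   ignore (Posix.IO.writeVec (D.fd d,
                              Word8VectorSlice.full (Byte.stringToBytes s)))

val () = writeString (D.stdout, "ok\n")

(* Rejected by the type checker, since D.input D.t is not D.output D.t:
 *   val () = writeString (D.stdin, "no\n")
 *)
```

Calling fd deliberately forgets the direction, which is how the Test
functor above gets a raw descriptor to pass to MLton.TextIO.newIn.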

> One last thing: the MLton mailing lists use a very bad web email
> archiver.  There's no way to download attachments or search it! At
> the risk of sounding biased, may I suggest lurker.sf.net? 
> <http://people.debian.org/~terpstra>

Actually, the google search that we use searches the entire site,
including the mailing list archives, so it's not totally unsearchable.
I find that google search followed by browsing the thread index
usually gives me what I want.  It is unfortunate that the mailing list
archives don't look like the rest of the site and don't make the
search function accessible.  I have somewhere on my todo list to hack
mailman to put our header and search box at the top of each page.

In general, it is preferable to have a single integrated search for a
whole site than to have a scoped search that is restricted to only a
portion of the site.  It would be a shame for someone to search for a
solution in the mailing lists only to miss their answer because it's
in the wiki.  Although, it is true that there are some cases where a
knowledgeable user would be able to take advantage of the extra
information (author, date range, ...) to do a more effective mail
search.  It's just that the default should be whole-site search, and
it should be very clear when a search is scoped.

Anyway, I'm not opposed to a more powerful mail archiver/searcher;
I'm just not convinced of its benefits.  Lurker looks nice.  If you're
willing to set it up, I'd be happy to make available all our old mail
and subscribe lurker to the lists, so people can try it out.