[MLton] Henry's comments on the User Guide
Stephen Weeks
MLton@mlton.org
Tue, 10 Feb 2004 04:20:58 -0800
> In the `Installation' section on page 3 the missing leading `/' in the file
> names is confusing. I know that this is to indicate that it is relative
> to where you install MLton, but it isn't clear. Also the statement that
> it installs `in root' is confusing. It installs things under root, but
> not at the top level.
I changed the text to
    MLton runs on a variety of platforms and is distributed in
    both source and binary form. The format for the binary package
    depends on the platform. The binary package will install under
    /usr or /usr/local, depending on the platform. If you install
    MLton somewhere else, you must set the lib variable in the
    bin/mlton script to the directory that contains the libraries
    (/usr/lib/mlton by default).
> The example on page 3 of -link-opt is bad because unless you change
> /etc/ld.so.conf, /usr/lib is looked in by default.
What example directory would you suggest?
> The `supports the full SML 97 language' paragraph at the top of page 5 should
> mention (or at least refer) to the known deviations of MLton
I added a reference to the "bugs" section.
> In the `complete basis library' I think that it is worth mentioning that we
> track the new spec and that we include many optional things like IntInf.
> (Perhaps just my prejudice.)
Done.
> In the `excellent running times' section on page 5, I don't think that the
> shootout web page is alive any more.
Yeah, I dropped it.
> In the `unboxed native arrays' section on page 5, it might be worth
> mentioning here that monomorphic arrays are just arrays in MLton.
Done.
> In the `runtime system supports large arrays' on page 5, can we really handle
> arrays (of 1, 8 or 16 bit objects I assume) up to 2^31 - 1 now?
> Excellent.
Yes, I added this a while back. The only limitations now are RAM size
and address space fragmentation. I would love to see some testing of
very large arrays. Here's the extent of my testing :-)
    open Array
    val a = array (valOf Int.maxInt div 2, #"a")
    val _ = update (a, 0, #"b")
    val _ = if sub (a, 0) = sub (a, 1)
               then raise Fail "bug"
            else ()
> In the `standalone executables' section on page 5, you want to say:
>
> You don't need any thing except the a.out and `standard' shared libraries
> by default. (Here standard means not coming from MLton and already
> on most systems.)
>
> You can get away with just the a.out because MLton can generate
> statically linked executables if desired.
I changed it to
    MLton generates standalone executables. No additional code or
    libraries are necessary in order to run an executable, except
    for the standard shared libraries. MLton can also generate
    statically linked executables.
> In the `signal handlers' section on page 6, the use of the word `thread' is
> confusing. It isn't something like a POSIX thread, but a MLton
> construction. I don't know what exactly to say, but as is it makes it
> seem that MLton has `real' threads, which it does not.
I changed it to say "MLton thread" instead of thread.
    MLton supports signal handlers written in SML. Signal
    handlers run in a separate MLton thread, and have access to
    the thread that was interrupted by the signal. Signal
    handlers can be used in conjunction with threads to implement
    preemptive multitasking.
I also changed the thread bullet point to read
    MLton has support for its own threads, upon which either
    preemptive or non-preemptive multitasking can be implemented.
    At some point in the future, MLton will support CML.
> In the first paragraph of `Compile-time options' on page 8, what about .a and
> .so files?
These are not allowed as files on the command line. You can use
-link-opt to link with such files.
> In the `-cc-opt option' section on page 8, doesn't the option also get used
> for .c (and .s) files even in the native mode?
Only for .c files. You're right about native. I've changed the text
to:
    Pass the option to gcc when compiling C code.
> In the `-export-header' section on page 8 you should mention that setting it
> to true not only outputs a C header file, it also stops compilation after
> doing so. I.e., it does NOT do the compile.
Done.
> Actually that choice seems
> a bit strange since if compile time is huge then you have to do it twice
> (although I assume that the true case is quicker).
The true case only has to do elaboration, so it is fast enough.
> In the `-inline' section on page 8 you have to at least say something about
> the units, even if it is only that they are arbitrary. At the moment it
> doesn't even say that the threshold is a size threshold, or a very rough
> estimate of that.
I changed the text to
    Set the inlining threshold used in the optimizer. The
    threshold is an approximate measure of code size of a
    procedure. The default is 320.
> In the `-runtime' section on page 9, you don't mention in discussing multiple
> uses that it is the LAST value of any parameter which dominates.
I added
    If the same runtime switch occurs more than once, then the
    last setting will take effect.
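As a rough illustration of that rule, here is a small model in Standard ML. It is only a sketch, not how the runtime is implemented: the function name effective is made up, and the switch values are illustrative. It assumes, as the guide states, that switches given with -runtime are processed before @MLton command line switches, so the command line can override them.

```sml
(* Model only: look up the effective value of a runtime switch.
   The fold visits settings left to right, so the last occurrence wins. *)
fun effective (name : string, settings : (string * string) list)
   : string option =
   List.foldl (fn ((n, v), acc) => if n = name then SOME v else acc)
              NONE settings

(* Illustrative values. Switches compiled in with -runtime come first,
   followed by the @MLton switches given when the executable is run. *)
val compiledIn = [("max-heap", "100m"), ("ram-slop", "0.5")]
val commandLine = [("max-heap", "200m"), ("max-heap", "300m")]
val settings = compiledIn @ commandLine

val maxHeap = effective ("max-heap", settings)   (* SOME "300m" *)
val ramSlop = effective ("ram-slop", settings)   (* SOME "0.5" *)
```

Because the two lists are simply concatenated, "last setting wins" and "the command line overrides -runtime" are the same rule in this model.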
> Also
> you have to say that command line arguments (via @MLton) are processed
> AFTER -runtime ones so that the result of -runtime can be overridden.
That was already there. The text says that the -runtime argument
"will be processed before other @MLton command line switches".
> Speaking of these multiple things, it is completely wrong and bad that
> you can use multiple
> @MLton ... --
> in a single run. (Discussed on page 10 in section 4.2 `Runtime system
> options'.) That means that it is absolutely impossible to call a MLton
> executable with a first argument of `@MLton' and a later argument of
> `--'. In particular, it is not possible to pass arguments to a MLton
> executable unless you KNOW that they are NOT going to contain `@MLton'.
> If you only allow 0 or 2 `@MLton's then I can wrap a MLton executable in
> a shell script:
> exec mlton-executable @MLton -- "$@"
> which will guarantee that any command line arguments are simply passed to
> the actual ML code and not eaten by the runtime system.
I see the problem. I think the best solution is add a runtime switch,
"stop", which causes the runtime to stop once it reaches the next
"--". That way, you can even compile an executable with "-runtime
stop" and the executable won't process any @MLton arguments. Or, if
you want to do a shell script you can do
    exec mlton-executable @MLton stop -- "$@"
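To spell out the intended behavior, here is a model in Standard ML of how the argument list would be divided with and without "stop". This is a sketch under assumptions, not the runtime's code (the runtime is written in C): the name splitArgs is made up, switches are treated as single words, and only leading "@MLton ... --" groups are consumed.

```sml
(* Model only: split an argument list into (runtime switches, program args). *)
fun splitArgs (args : string list) : string list * string list =
   let
      (* Inside an "@MLton ... --" group: collect switches up to "--".
         Once "stop" has been seen, the next "--" ends runtime processing. *)
      fun group (rest, switches, stop) =
         case rest of
            [] => (rev switches, [])
          | "--" :: rest' =>
               if stop
                  then (rev switches, rest')
               else outside (rest', switches)
          | s :: rest' =>
               group (rest', s :: switches, stop orelse s = "stop")
      (* Outside a group: a leading "@MLton" opens another group;
         anything else belongs to the program. *)
      and outside (rest, switches) =
         case rest of
            "@MLton" :: rest' => group (rest', switches, false)
          | _ => (rev switches, rest)
   in
      outside (args, [])
   end

(* Without "stop", user arguments that look like a group are eaten. *)
val eaten = splitArgs ["@MLton", "--", "@MLton", "x", "--"]

(* With "stop", everything after the first "--" reaches the program. *)
val kept = splitArgs ["@MLton", "stop", "--", "@MLton", "x", "--"]
```

In this model eaten is (["x"], []) and kept is (["stop"], ["@MLton", "x", "--"]), which is the guarantee the wrapper script needs.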
> In the `-show-basis' and `show-basis-used' section on page 9, change
> `displays' to `prints to standard output' to be more clear and in sync
> with -export-header.
I made them all consistently use "print".
> Also it is rather confusing because -show-basis
> causes the types to be displayed (but NOT the basis library itself) while
> -show-basis-used just lists the things used, but not their types.
I changed them to use the same layout routine, so that all now
display the types.
> In connection with this, I would love to have an option which caused
> MLton to write out the types for my code (like -show-basis does for the
> basis). It would be a very convenient place to get a summary of some
> code.
Done. Try "mlton -show-basis true foo.sml".
> Note, for both of these options and also -export-header, it might make
> more sense to have them accept a file name instead of the true/false.
> This would allow them to be combined. Right now it seems that if you
> turn both -show-basis and -show-basis-used on then only the latter is
> done with no error message.
It's a mistake to do two (or more) things that print to standard output.
So, I think they should stay true/false, but should report an error if
more than one is on. I've added a check that at most one of the
following is defined.
    -export-header
    -show-basis
    -show-basis-used
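The check amounts to something like the following Standard ML sketch. It is not the compiler's actual code; the function name conflicts and the representation of options as (name, on/off) pairs are assumptions made for illustration.

```sml
(* Hypothetical helper: given each option's name and whether it is on,
   return the names that conflict (empty when at most one is on). *)
fun conflicts (opts : (string * bool) list) : string list =
   let
      val on = List.map (fn (name, _) => name)
                        (List.filter (fn (_, isOn) => isOn) opts)
   in
      if List.length on > 1 then on else []
   end

(* Two printing options turned on at once: both names are reported. *)
val bad = conflicts [("-export-header", false),
                     ("-show-basis", true),
                     ("-show-basis-used", true)]

(* Only one turned on: no conflict. *)
val ok = conflicts [("-export-header", true),
                    ("-show-basis", false),
                    ("-show-basis-used", false)]
```

A caller would report an error whenever the returned list is non-empty.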
> In the `-no-load-world' section on page 11 it is worth mentioning that the
> reason for this is just for set-uid programs.
Done. I wonder if no-load-world is almost unneeded now that one can
use -runtime stop?
> In the `-ram-slop' section on page 11 it is worth mentioning that x should
> probably be no more than 1 and that making it strictly less than 1 is to
> account for space used by the OS and other programs running at the same
> time.
Done.
> The fact that _import and _export introduce phrases which are expressions
> makes the choice of a trailing semicolon very bad. This must mean that,
> for example, in
> val z = _import "foo": real * char -> int;
> the semicolon is part of the expression, right?
Yes.
> I understand the need to have a terminator for these expressions because
> they would otherwise end with a type which would make parsing tricky,
Right.
> but
> it seems that some other item would be much better. How about
> _import "foo": [real * char -> int]
> or use `{' and `}' or a trailing `end'.
Instead of [] or {}, how about requiring the type to be parenthesized?
That works, parsing wise. I wouldn't really mind "end" either,
although I think I would prefer parens.
But, I don't really see the problem with the current approach. ";" is
no less ambiguous than "end", or parens for that matter. The drawback
of changing is that it breaks old code, and that we can't support both
old and new (since the parser can't handle it). I don't see that the
benefit of the change outweighs the drawback.
> All of this really became apparent in the example in the `Calling from C
> to SML' section on page 12:
>
> _export "foo": real * char -> int;
> (fn (x, c) => 13 + Real.floor x + Char.ord c)
>
> Yikes.
Yeah, that was a bit too cute. I rewrote it the way I always write
exports.
    val e = _export "foo": real * char -> int;
    val _ = e (fn (x, c) => 13 + Real.floor x + Char.ord c)
Here's a few other changes I made in response to the marked-up
hardcopy that you sent.
* mlprof now only displays a call graph when called with -call-graph
true
* Yes, LargeWord = Word64, not Word32.
* For #line directives, the file name can not contain *). If it does,
then MLton will (correctly) end the comment. For example, the code
    (*line 13.1 "foo*)"*)
will cause an unclosed string error, because the first *) ends the
comment, causing the second " to start a string. So, I don't think
there is any incompatibility with the definition regarding #line
directives.
* I changed the type of MLton.Exn.topLevelHandler to "exn -> 'a".
* I changed the text describing MLton.Pointer.getX (p, i) to read
    returns the object stored at index i of the array of X
    objects pointed to by p. For example, getWord32 (p, 7)
    returns the 32-bit word stored 28 bytes beyond p.
* MLton.ProcEnv.setenv doesn't require its arguments to be null
terminated. I added this to the documentation.
* The MLton.Process.spawn{,e,p} functions were patterned on the
corresponding Posix.Process.exec{,e,p} functions. I did change the
arguments to be named records instead of tuples, which may have been
a mistake. I could change it back so that they look like the exec
counterparts. However, the record field names were taken straight
from the basis library documentation. I don't think path is such a
bad name for the argument to spawn or spawne, since it is given a
full path to an executable. The only reason I didn't add
    val MLton.Process.spawnpe:
       {file: string, args: string list, env: string list} -> pid
is that there is no corresponding Posix.Process.execpe. I could add
MLton.Process.{exec,spawn}pe if needed.
* I clarified the semantics of MLton.Profile when compiling -profile
no.
* I think it is better to return NONE than raise an exception when
/dev/{,u}random can't be read from, as will always happen on
Cygwin. That way, a programmer must explicitly decide what to do.
* I made lots of changes to MLton.Signal. See the latest user guide
and the latest MLton structure.
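On the /dev/{,u}random item above, here is a sketch of why returning an option forces the decision onto the caller. It uses only the Basis Library and is not MLton's implementation: the helper seedFrom is hypothetical, and the clock-based fallback is just one choice a programmer might make.

```sml
(* Hypothetical helper: read 4 bytes from a device file and pack them
   into a word. Returns NONE if the file cannot be opened or read. *)
fun seedFrom (path : string) : word option =
   (let
       val ins = BinIO.openIn path
       val bytes = BinIO.inputN (ins, 4)
       val () = BinIO.closeIn ins
    in
       if Word8Vector.length bytes < 4
          then NONE
       else SOME (Word8Vector.foldl
                  (fn (b, w) =>
                   Word.orb (Word.<< (w, 0w8), Word.fromInt (Word8.toInt b)))
                  0w0 bytes)
    end) handle IO.Io _ => NONE

(* The caller must say what happens when no device is available.
   Falling back to the clock is an assumption made for this example. *)
val seed : word =
   case seedFrom "/dev/urandom" of
      SOME w => w
    | NONE => Word.fromLargeInt (Time.toMilliseconds (Time.now ()))
```

With an exception instead of NONE, the failure case could be silently ignored until it happens at run time on a platform such as Cygwin.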