[MLton-user] SML unicode support

Stephen Weeks MLton-user@mlton.org
Thu, 6 Jan 2005 18:47:19 -0800


> If you know ahead of time that you are going to go between different types
> of Char, then clearly functorizing is the way to go.  The point is that if
> you didn't know, or are going to use some code you found that didn't know,
> then the re-definition of Char seems to be an easy `solution'.

I like the MLB solution, but neither it nor the functor solution goes
far enough in achieving what Franck wanted.  Both suffer from the
problem that existing libraries that already use char will not be
affected, and will hence be incompatible with your code that makes
Char = WideChar.  This is especially bad because the basis library
uses Char and String.  So if you put

  structure Char = WideChar
  structure String = WideString

in an MLB for your program, you will run into all kinds of problems
trying to use the basis library.  Of course, you could build a wrapper
around the basis library to make it look like its Char is WideChar and
its String is WideString, but that's exactly the work we're trying to
spare the programmer.
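
To make the mismatch concrete, here is a rough sketch, not tested
against any particular compiler.  It assumes an implementation that
provides the optional WideChar and WideString structures, and that
WideString.string is a distinct type from string.  The names greeting
and n are made up for illustration.

```sml
(* Sketch only: assumes the optional WideChar/WideString structures exist
   and that WideString.string is a different type from string. *)
structure Char = WideChar
structure String = WideString

(* Code written against the rebound names now builds wide strings. *)
val greeting : String.string =
  String.implode [Char.chr 104, Char.chr 105]   (* the characters h, i *)

(* The rebound String works fine on its own values. *)
val n = String.size greeting

(* But the rest of the basis still uses the original 8-bit string, so
   uncommenting the next line should be rejected by the type checker:
   TextIO.print expects string, not WideString.string. *)
(* val () = TextIO.print greeting *)
```

The same kind of type error would show up at every boundary with a
library that was compiled against the original Char and String.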

The only solution I see is what Franck proposes -- a flag that
specifies the width of the primitive char type.  Something like

  -char-width {8,16,32}

This is a lot of work to implement, but would be very nice to have.
It's also similar to other flags we've been thinking about to specify
the width of the primitive integer, real, and word types.  

We'll do the WideChar stuff first to get the support in, and then
think about the flags.