[MLton] WideChar?

Henry Cejtin henry@sourcelight.com
Wed, 8 Dec 2004 18:15:25 -0600


Ah, I forgot about the date-format stuff in locales.  The char-set connection
is in predicates such as isAlpha.

As to exceptions vs. returning NONE: if I am reading from, say, a UTF-8 file,
then I REALLY want an exception if the bytes are not legal UTF-8.  On the
other hand, I really want NONE if I apply something like Int.scan to a wide
string and it does not start with an integer.
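
To make the distinction concrete, here is a rough sketch in Python rather
than SML.  Both function names are made up for illustration, and scan_int is
only a loose analogue of Int.scan (it does not skip whitespace or accept ~ as
a sign).

```python
from typing import Optional, Tuple


def decode_utf8_strict(data: bytes) -> str:
    """Decode bytes read from a file; malformed UTF-8 raises UnicodeDecodeError."""
    return data.decode("utf-8", errors="strict")


def scan_int(text: str) -> Optional[Tuple[int, str]]:
    """Scan an optional sign and decimal digits from the front of text.

    Returns (value, rest), or None when text does not start with an integer.
    """
    i = 0
    if i < len(text) and text[i] in "+-":
        i = 1
    start = i
    while i < len(text) and text[i] in "0123456789":
        i += 1
    if i == start:
        # No digits at all: an expected outcome, so no exception is raised.
        return None
    return int(text[:i]), text[i:]
```

Bad bytes in a file mean the input itself is broken, so the decoder raises.
A string that does not start with a number is an ordinary result of
scanning, so the scanner returns None.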

As you say, the fact that a Unicode code point needs more than 4 hex digits
(nibbles) is really a problem.  You cannot make the number of hex digits
after \u variable, because then the escape is ambiguous: you can't tell where
the character code ends.  Always requiring 8 hex digits would be even more
onerous than having to use \u at all.
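
The ambiguity is easy to demonstrate.  The sketch below (Python, with a
made-up helper name, and assuming a hypothetical rule that allows anywhere
from 4 to 8 hex digits after \u) lists every reading of the text that follows
the escape prefix.  It does not check that the result is a valid code point.

```python
import string

HEX_DIGITS = set(string.hexdigits)


def possible_parses(body: str, min_digits: int = 4, max_digits: int = 8):
    """Return every (code_point, rest) reading of the text after the escape
    prefix, if the escape may use min_digits..max_digits hex digits."""
    run = 0
    while run < len(body) and run < max_digits and body[run] in HEX_DIGITS:
        run += 1
    # Each allowed digit count within the run of hex digits is a valid reading.
    return [(int(body[:n], 16), body[n:]) for n in range(min_digits, run + 1)]


# Variable width: "1F600 grin" has two readings.
variable = possible_parses("1F600 grin")

# Fixed width of exactly 4 digits: only one reading is possible.
fixed = possible_parses("0041BC", min_digits=4, max_digits=4)
```

With the variable-width rule, the text can mean U+1F60 followed by the
character 0, or U+1F600 on its own.  With a fixed width there is exactly one
reading, at the cost of always typing that many digits.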

I still don't see the need for anything other than 1-byte characters (ord
0-255) and 4-byte characters.  I.e., we have ASCII/ISO-Latin-1 or else we
have Unicode.
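
A small arithmetic check of the widths involved, sketched in Python with
made-up helper names.  The only outside fact assumed is that the largest
Unicode code point is U+10FFFF.

```python
MAX_LATIN1 = 0xFF         # largest ord for a 1-byte character
MAX_UNICODE = 0x10FFFF    # largest Unicode code point


def fits_in_bytes(code_point: int, width: int) -> bool:
    """True if code_point can be stored in an unsigned field of width bytes."""
    return 0 <= code_point < (1 << (8 * width))


def fixed_width_for(text: str) -> int:
    """Pick between the two widths discussed: 1 byte if every character has
    ord <= 255, otherwise 4 bytes."""
    return 1 if all(ord(c) <= MAX_LATIN1 for c in text) else 4
```

Since fits_in_bytes(MAX_UNICODE, 2) is false, a 2-byte character cannot hold
every code point directly, while a 4-byte character can.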

I don't think that the official standard for SML is online.