[MLton] WideChar?
Henry Cejtin
henry@sourcelight.com
Wed, 8 Dec 2004 18:15:25 -0600
Ah, I forgot about the date-format stuff in locales. The char-set connection
is in things like isAlpha.
As to exceptions vs. returning NONE, I think that if I were reading from, say,
a UTF-8 file, then I REALLY want an exception if the bytes are not legal UTF-8.
On the other hand, I really want NONE if I apply something like Int.scan to a
wide string and it finds no number.
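
A rough sketch of the two behaviours, written in Python rather than SML purely
for illustration. The names decode_utf8_strict and scan_int are hypothetical;
scan_int only imitates the option-returning style of Int.scan and is not a
port of it (it does not skip whitespace or accept SML's ~ sign).

```python
from typing import Optional, Tuple


def decode_utf8_strict(data: bytes) -> str:
    """Decode UTF-8, raising UnicodeDecodeError on any illegal byte sequence."""
    # errors="strict" is the default; it is spelled out to make the intent clear.
    return data.decode("utf-8", errors="strict")


def scan_int(text: str) -> Optional[Tuple[int, str]]:
    """Scan a leading decimal integer; return (value, rest) or None.

    Hypothetical stand-in for an Int.scan-style function: not finding a
    number is an ordinary result, not an error.
    """
    i = 0
    if i < len(text) and text[i] in "+-":
        i += 1
    start = i
    while i < len(text) and text[i] in "0123456789":
        i += 1
    if i == start:
        # No digits were found, so there is nothing to convert.
        return None
    return int(text[:i]), text[i:]
```

Malformed input bytes are treated as a hard failure, while a string that simply
does not start with a number yields None and the caller carries on.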
As you say, the fact that a Unicode code point can need more than 4 nibbles is
really a problem. You cannot make the number of hex digits in \u variable,
because then it is ambiguous: you can't tell where the character code ends.
Always requiring 8 hex digits would be even more onerous than just the fact
that you need to use \u at all.
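
To make the ambiguity concrete, here is a small sketch, again in Python only
for illustration. read_fixed and read_greedy are hypothetical helpers, not
anything from SML or MLton; the greedy variant is one possible reading of a
"variable number of digits" rule.

```python
HEX_DIGITS = "0123456789abcdefABCDEF"


def read_fixed(text: str, digits: int = 4) -> tuple:
    """Read a backslash-u escape with exactly `digits` hex digits.

    Returns (character, remaining text).
    """
    body = text[2:2 + digits]
    if not text.startswith("\\u") or len(body) != digits:
        raise ValueError("malformed escape")
    if any(c not in HEX_DIGITS for c in body):
        raise ValueError("malformed escape")
    return chr(int(body, 16)), text[2 + digits:]


def read_greedy(text: str) -> tuple:
    """Read a backslash-u escape that swallows every hex digit that follows."""
    if not text.startswith("\\u"):
        raise ValueError("malformed escape")
    end = 2
    while end < len(text) and text[end] in HEX_DIGITS:
        end += 1
    if end == 2:
        raise ValueError("malformed escape")
    return chr(int(text[2:end], 16)), text[end:]


source = r"\u00e9d"          # intended: U+00E9 followed by the letter d
print(read_fixed(source))    # 4 digits: U+00E9, and "d" is left over
print(read_greedy(source))   # greedy: the "d" is eaten, giving U+0E9D
print(read_fixed(r"\u0001D11E", digits=8))  # 8 digits can reach U+1D11E
```

With a fixed width the reader always knows where the escape stops; with a
greedy rule any hex-looking letter after the escape changes the character.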
I still don't get the need for anything other than 1 byte characters (ord
0-255) and 4 byte characters. I.e., we have ASCII/ISO-Latin-1 or else we
have Unicode.
I don't think that the official standard for SML is online.