Wesley W. Terpstra
Tue, 22 Nov 2005 01:29:23 +0100
So, I've been filling out the structures, and I ran into a snag...
In the WideChar structure we have:
> val scan : (Char.char, 'a) StringCvt.reader
> -> (char, 'a) StringCvt.reader
> val fromCString : String.string -> char option
> val fromString : String.string -> char option
This is no problem; it upcasts Char to WideChar.
> val toString : char -> String.string
More of a problem...
I have to use the MLton-specific \U12345678 escape code.
> val toCString : char -> String.string
An even worse problem.
There is no way to express the Unicode chars via C escapes.
I suppose the most 'reasonable' thing I could do would be
to dump the text in as UTF-8, escaped with \x32 codes...
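
A minimal sketch of the "UTF-8, escaped" idea, for illustration only. It assumes code points are modelled as plain `int` values, since `WideChar` is not available in every implementation. The names `utf8Bytes`, `hexEscape` and `toCStringUtf8` are hypothetical and are not part of the Basis Library or MLton. Surrogate code points are not rejected here.

```sml
(* Hypothetical helper: the UTF-8 byte values for a code point.
   Raises Chr for values outside 0..0x10FFFF. *)
fun utf8Bytes (cp : int) : int list =
  if cp < 0 orelse cp > 0x10FFFF then raise Chr
  else if cp < 0x80 then [cp]
  else if cp < 0x800 then
    [0xC0 + cp div 0x40, 0x80 + cp mod 0x40]
  else if cp < 0x10000 then
    [0xE0 + cp div 0x1000,
     0x80 + (cp div 0x40) mod 0x40,
     0x80 + cp mod 0x40]
  else
    [0xF0 + cp div 0x40000,
     0x80 + (cp div 0x1000) mod 0x40,
     0x80 + (cp div 0x40) mod 0x40,
     0x80 + cp mod 0x40]

(* Hypothetical helper: one byte as a two-digit C hex escape, e.g. \xE2. *)
fun hexEscape (b : int) : string =
  "\\x" ^ StringCvt.padLeft #"0" 2 (Int.fmt StringCvt.HEX b)

(* ASCII goes through the ordinary Char.toCString; everything else is
   emitted as escaped UTF-8 bytes. *)
fun toCStringUtf8 (cp : int) : String.string =
  if cp >= 0 andalso cp < 0x80
  then Char.toCString (Char.chr cp)
  else String.concat (map hexEscape (utf8Bytes cp))
```

With these definitions, `toCStringUtf8 0x20AC` yields the twelve characters `\xE2\x82\xAC`. One caveat with this choice: a C hex escape has no fixed length, so an escaped byte followed directly by a literal hex digit would be read as one longer escape. Three-digit octal escapes, which `Char.toCString` already uses for control characters, avoid that.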
Anyway, what I am wondering is whether they really meant
String.string here, or just string? It seems to me that
it would be more useful to have toCString output a C string
of the same charset...
Should I just s/String\.string/string/g ?
For now, I am following the spec and raising Chr if toCString
can't fit a code point.
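
The spec-following behaviour described above could look roughly like the sketch below. It again models a wide character as an `int` code point, and `toCStringStrict` is a hypothetical name. It assumes `Char.maxOrd` is the largest code the narrow `Char` can hold (255 with an 8-bit `Char`).

```sml
(* Hypothetical sketch: escape a code point as a narrow C string,
   raising Chr when the code point does not fit in Char.char. *)
fun toCStringStrict (cp : int) : String.string =
  if cp < 0 orelse cp > Char.maxOrd
  then raise Chr
  else Char.toCString (Char.chr cp)
```

The explicit range check is only there to make the intent visible; `Char.chr` raises `Chr` on its own for out-of-range arguments.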
Also, I noticed that the standard says:
> In WideChar, the functions toLower, toLower, isAlpha,..., isUpper and,
> in general, the definition of a ``letter'' are locale-dependent
... which I have decided to ignore.
WideChar will also be locale-independent.
Wesley W. Terpstra <firstname.lastname@example.org>