[MLton] Wide{Char,String}.toCString

Wesley W. Terpstra wesley@terpstra.ca
Tue, 22 Nov 2005 01:29:23 +0100


So, I've been filling out the structures, and I ran into a snag...

In the WideChar structure we have:

> val scan       : (Char.char, 'a) StringCvt.reader
>                    -> (char, 'a) StringCvt.reader
> val fromCString : String.string -> char option 
> val fromString : String.string -> char option

This is no problem; it upcasts Char to WideChar.

> val toString : char -> String.string

More of a problem...
I have to use the MLton-specific \U12345678 escape code.

> val toCString : string -> String.string

Even more of a problem.

There is no way to express the Unicode chars via C escapes.
I suppose the most 'reasonable' thing I could do would be
to dump the text in as UTF-8, escaped with \x32 codes...
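
To make the UTF-8 idea concrete, here is a rough sketch, not the
code in the tree. It assumes a wide char can be treated as a plain
int code point; utf8Bytes and escapeCodePoint are made-up names.
Each byte goes through Char.toCString, so non-printable bytes come
out as three-digit octal escapes rather than \x codes (a C \x escape
swallows any hex digits that follow it). Surrogates are not rejected.

```sml
(* Sketch only: utf8Bytes and escapeCodePoint are hypothetical helpers,
   not part of the Basis Library or MLton. A wide character is modelled
   here as a plain int code point. *)

(* Encode a code point (0 .. 0x10FFFF) as a list of UTF-8 byte values. *)
fun utf8Bytes (cp : int) : int list =
  if cp < 0 orelse cp > 0x10FFFF then raise Chr
  else if cp < 0x80 then [cp]
  else if cp < 0x800 then
    [0xC0 + cp div 0x40,
     0x80 + cp mod 0x40]
  else if cp < 0x10000 then
    [0xE0 + cp div 0x1000,
     0x80 + (cp div 0x40) mod 0x40,
     0x80 + cp mod 0x40]
  else
    [0xF0 + cp div 0x40000,
     0x80 + (cp div 0x1000) mod 0x40,
     0x80 + (cp div 0x40) mod 0x40,
     0x80 + cp mod 0x40]

(* Escape each UTF-8 byte with the narrow Char.toCString: printable
   ASCII passes through, every other byte becomes a C escape. *)
fun escapeCodePoint (cp : int) : string =
  String.concat (map (Char.toCString o Char.chr) (utf8Bytes cp))
```

With this, escapeCodePoint 0x20AC evaluates to the text \342\202\254
(the bytes E2 82 AC), and escapeCodePoint 0x41 is just A.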

Anyway, what I am wondering is whether they really meant
String.string here, or just string? It seems to me that
it would be more useful to have toCString output a C string
of the same charset...

Should I just s/String\.string/string/g ?

For now, I am following the spec and raising Chr if toCString
can't fit a code point.
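
As a rough illustration of that behaviour (again modelling wide
chars as int code points; narrowToCString is a made-up name, not
the real WideString.toCString), Char.chr already raises Chr for
anything outside 0..Char.maxOrd, so the narrowing can lean on it:

```sml
(* Sketch only: narrowToCString is a hypothetical helper. It escapes a
   list of code points into a narrow C-style string and raises Chr as
   soon as a code point does not fit in Char.char. *)
fun narrowToCString (cps : int list) : string =
  String.concat
    (map (fn cp => Char.toCString (Char.chr cp)) cps)
    (* Char.chr raises Chr when cp < 0 or cp > Char.maxOrd *)
```

So narrowToCString [0x48, 0x69, 0x0A] gives Hi\n (with a literal
backslash), while narrowToCString [0x20AC] raises Chr.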

Also, I noticed that the standard says:
> In WideChar, the functions toLower, toUpper, isAlpha,..., isUpper and,
> in general, the definition of a ``letter'' are locale-dependent
... which I have decided to ignore. 
WideChar will also be locale-independent.

-- 
Wesley W. Terpstra <wesley@terpstra.ca>