[MLton] Re: [Sml-basis-discuss] Unicode and WideChar support

skaller skaller@users.sourceforge.net
Thu, 01 Dec 2005 00:30:56 +1100


On Wed, 2005-11-30 at 13:49 +0100, Wesley W. Terpstra wrote:

> 1. If you write a string in SML 'val x = "asfasf"', then this string must
> contain the code points which correspond to the symbol with shape 'a',
> then 's', ...
> When you have a single storage type, with multiple charsets, then this is
> ambiguous. ie: Is #"€" 0xA4 or 0x80? Depends on your charset!

So what? This is what C does, and it is correct. The meaning
of "whatever" isn't charset dependent. It is a literal: it
contains whatever you put there with an editor. They're
just bytes. If you want to convert this to a Unicode UCS-4
representation, you, the programmer, have to know what charset
is used. There is no way the system can know: at least on Linux,
bytes are just bytes.
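
To make the point concrete, here is a small sketch in Python (chosen
only for brevity; it is not part of the SML or C discussion). The helper
name decode_byte is hypothetical. The bytes 0xA4 and 0x80 from the
quoted example only turn into the euro sign once the programmer names
the charset explicitly, and the same byte 0xA4 gives a different code
point under ISO-8859-1:

```python
# The same byte maps to different code points under different charsets.
# decode_byte is a hypothetical helper written for this illustration.

def decode_byte(byte_value: int, charset: str) -> str:
    """Decode a single byte using a charset the caller must supply."""
    return bytes([byte_value]).decode(charset)

latin9_a4 = decode_byte(0xA4, "iso8859_15")  # euro sign in ISO-8859-15
cp1252_80 = decode_byte(0x80, "cp1252")      # euro sign in Windows-1252
latin1_a4 = decode_byte(0xA4, "latin_1")     # currency sign in ISO-8859-1

# Show the resulting Unicode code points.
print(hex(ord(latin9_a4)), hex(ord(cp1252_80)), hex(ord(latin1_a4)))
# prints: 0x20ac 0x20ac 0xa4
```

Nothing in the byte itself says which of these decodings is intended;
that information has to come from outside the data.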

-- 
John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: http://felix.sf.net