[MLton] WideChar.{chr,ord}

Wesley W. Terpstra wesley@terpstra.ca
Fri, 25 Nov 2005 19:30:49 +0100



WideChar.{chr,ord} convert back and forth between
an 'int' and a 'char'. Great. Char4 won't fit! I wish I'd
realized this earlier...
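
To make the size problem concrete, here is a minimal sketch in Python (not SML). It assumes that Char4 means a 4-byte character type and that 'int' is a 32-bit signed integer; neither is stated in this message, and the helper name fits_in_signed_int is made up for illustration.

```python
def fits_in_signed_int(char_bytes, int_bits=32):
    """Return True if every ordinal of a char_bytes-wide character
    fits in a signed integer of int_bits bits (assumed sizes)."""
    max_ord = 2 ** (8 * char_bytes) - 1    # largest ordinal of the char type
    max_int = 2 ** (int_bits - 1) - 1      # largest signed int value
    return max_ord <= max_int

# 1- and 2-byte characters fit; a full 4-byte character does not.
results = {n: fits_in_signed_int(n) for n in (1, 2, 4)}
```

Under those assumptions, ord on a 4-byte character can produce values above the largest int, which is the "won't fit" problem.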

Anyways.

Damn.

Char{1,2,4} sound like they're out.
Maybe Char{8,16,21} would be better.

However, the standard forbids this:
"The optional WideChar structure defines wide characters, which are
represented by a fixed number of 8-bit words (bytes)"
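
A quick check of why 21 is the natural width and why it collides with the sentence quoted above. This is a Python sketch; it relies only on the fact that the largest Unicode code point is 0x10FFFF, and the variable names are illustrative.

```python
MAX_CODE_POINT = 0x10FFFF                  # largest Unicode code point

bits_needed = MAX_CODE_POINT.bit_length()  # minimum bits to hold any code point
whole_bytes = bits_needed % 8 == 0         # is that a whole number of bytes?
bytes_needed = (bits_needed + 7) // 8      # smallest byte count that is wide enough
```

Here bits_needed is 21, which is not a multiple of 8, so a 21-bit character cannot be "a fixed number of 8-bit words"; the smallest byte-sized width that covers the range is 3 bytes.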

Also, the standard even manages to screw up normal
Char. It says that 'char' corresponds to the "extended
ASCII 8-bit character set". What exactly is this?

We decided earlier that they really meant ISO-8859-1.
This makes sense because WideChar.chr o Char.ord
will then leave the character unchanged.
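
The property being relied on is that ISO-8859-1 byte values coincide with the first 256 Unicode code points. A small Python sketch of that check follows; the helper name widen is made up, and Windows-1252 is included only as a contrasting example of another "extended ASCII" reading that the message itself does not mention.

```python
def widen(byte_value, encoding="latin-1"):
    """Decode one byte in the given 8-bit encoding and return
    the Unicode code point of the resulting character."""
    return ord(bytes([byte_value]).decode(encoding))

# Under ISO-8859-1 every byte keeps its numeric value when widened.
latin1_is_identity = all(widen(b) == b for b in range(256))

# Under Windows-1252, byte 0x80 becomes U+20AC (euro sign) instead.
cp1252_0x80 = widen(0x80, "cp1252")
```

With the ISO-8859-1 reading, widening is a plain numeric copy; with a different 8-bit code page it would need a translation table.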

However, if Char is really ISO-8859-1, then the
definition of is{Alpha,...} is wrong for Char!!!
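
An illustration of the mismatch, sketched in Python. It assumes the Basis predicates for Char treat only the ASCII letters as alphabetic, which is how this message reads the definition; ascii_is_alpha is a made-up stand-in for that rule, not the Basis function itself.

```python
def ascii_is_alpha(c):
    """ASCII-only letter test (assumed stand-in for the Char rule)."""
    return ("A" <= c <= "Z") or ("a" <= c <= "z")

# Byte 0xE9 in ISO-8859-1 is U+00E9, LATIN SMALL LETTER E WITH ACUTE.
e_acute = bytes([0xE9]).decode("latin-1")

ascii_answer = ascii_is_alpha(e_acute)   # ASCII-only rule: not a letter
unicode_answer = e_acute.isalpha()       # Unicode rule: a letter
```

The two rules disagree on the same character, so an ISO-8859-1 Char cannot satisfy both at once.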

WideChar sounded so straightforward when I started
this, but it's really causing me a headache. I don't think
it is possible to have an implementation that conforms
to both the Basis and Unicode rules.

I think I'm going to start begging people to make a few
modifications on sml-basis-discuss...

