[MLton] Re: [Sml-basis-discuss] Unicode and WideChar support
Geoffrey Alan Washburn
geoffw@cis.upenn.edu
Tue, 29 Nov 2005 11:56:16 -0500
John Reppy wrote:
> The lexer doesn't generate strings. The input is assumed to be 8-bit
> characters (i.e., type char) and one can specify 7-bit, 8-bit, and UTF-8
> interpretations of the character stream (ML-lex only supports 7-bit and
> 8-bit).
Okay, maybe I need to rephrase my question: if you tell it to use UTF-8
for the input stream, what type does yytext (or the equivalent) have?
Is it just string, possibly containing sequences of high-bit characters?
--
[Geoff Washburn|geoffw@cis.upenn.edu|http://www.cis.upenn.edu/~geoffw/]