[MLton] Re: [Sml-basis-discuss] Unicode and WideChar support
Aaron Turon
adrassi@gmail.com
Tue, 29 Nov 2005 11:18:44 -0600
On 11/29/05, John Reppy <jhr@cs.uchicago.edu> wrote:
> I think that we'll have
>
> val yytext : unit -> substring
>
> where UTF-8 is used to encode unicode characters. We use substrings
> to avoid unnecessary copying and a function to be lazy about
> substring creation (our assumption is that compilers are better at
> eliminating unused local functions than unused calls to external
> functions that happen to be pure).
That sounds right. On the other hand, we are already doing the work
of decoding UTF-8 during lexical analysis, so we might be able to
offer an additional value (say, yyutext) that yields the decoded
substring. That feature could probably wait until a standard
Unicode representation is established.
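
A minimal sketch of the two values discussed above, under stated assumptions. Only the names yytext and yyutext and the type unit -> substring come from this thread. The structure LexSketch, the helpers mkYytext, decodeOne and mkYyutext, the (buffer, start, length) view of a match, and the choice of word list as the decoded form are all hypothetical, since no standard Unicode representation has been settled. The decoder assumes well-formed UTF-8 and does no validation.

```sml
(* Hypothetical sketch: apart from yytext/yyutext, these names are
   illustrative and not part of any existing lexer generator API. *)
structure LexSketch =
struct
  (* Assumed layout: the match is buf[start .. start+len-1].
     Returning a thunk defers substring creation until the action
     actually asks for the text. *)
  fun mkYytext (buf : string, start : int, len : int) : unit -> substring =
    fn () => Substring.substring (buf, start, len)

  (* Decode one UTF-8 sequence from the front of a substring.
     Assumes well-formed input: no checks for stray continuation
     bytes, overlong forms, or surrogates. Returns NONE on empty or
     truncated input. *)
  fun decodeOne (ss : substring) : (word * substring) option =
    case Substring.getc ss of
      NONE => NONE
    | SOME (c, rest) =>
        let
          val b = Word.fromInt (Char.ord c)
          (* Consume n continuation bytes, adding 6 payload bits each. *)
          fun cont (0, acc, r) = SOME (acc, r)
            | cont (n, acc, r) =
                (case Substring.getc r of
                   NONE => NONE
                 | SOME (c', r') =>
                     cont (n - 1,
                           Word.orb (Word.<< (acc, 0w6),
                                     Word.andb (Word.fromInt (Char.ord c'),
                                                0wx3F)),
                           r'))
        in
          if b < 0wx80 then SOME (b, rest)
          else if b < 0wxE0 then cont (1, Word.andb (b, 0wx1F), rest)
          else if b < 0wxF0 then cont (2, Word.andb (b, 0wx0F), rest)
          else cont (3, Word.andb (b, 0wx07), rest)
        end

  (* yyutext is built on yytext, so it is equally lazy. The word list
     result is a stand-in for whatever decoded type is adopted. *)
  fun mkYyutext (yytext : unit -> substring) : unit -> word list =
    fn () =>
      let
        fun loop (ss, acc) =
          case decodeOne ss of
            NONE => List.rev acc
          | SOME (w, rest) => loop (rest, w :: acc)
      in
        loop (yytext (), [])
      end
end

(* Usage: the two bytes \206\187 (0xCE 0xBB) encode U+03BB. *)
val yytext  = LexSketch.mkYytext ("x = \206\187", 4, 2)
val yyutext = LexSketch.mkYyutext yytext
val cps     = yyutext ()   (* [0wx3BB] *)
```

Because both values are thunks, an action that never calls either one pays for neither the substring nor the decoding.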
Aaron