[MLton] Unicode / WideChar
Henry Cejtin
henry.cejtin@sbcglobal.net
Sun, 20 Nov 2005 15:37:20 -0600
It would certainly be great to have a real Unicode WideChar.
Re most of your questions, I don't have enough experience to really say. One
thing that I would REALLY like to be true would be for none of these to
depend on locale. Perhaps that isn't possible, but I have definitely been
burned by this dependency.
For digits, I suspect that you really just want ASCII 0-9, but again I am not
at all certain. My notion is that with that definition, lots of number-
cracking code could just switch to WideChar and work.
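
As a rough illustration of the ASCII-only definition, here is a sketch in
Python (used only for brevity; nothing here comes from MLton or the Basis
Library, and the names is_ascii_digit and parse_decimal are made up for this
example). The test looks only at the code point, so no locale setting can
change its answer:

```python
def is_ascii_digit(cp: int) -> bool:
    """True only for code points U+0030..U+0039 (ASCII '0'..'9')."""
    return 0x30 <= cp <= 0x39


def parse_decimal(text: str) -> int:
    """Parse a non-empty string of ASCII digits into an int.

    Any other character, including non-ASCII Unicode digits, is rejected.
    """
    if not text:
        raise ValueError("empty string")
    value = 0
    for ch in text:
        cp = ord(ch)
        if not is_ascii_digit(cp):
            raise ValueError(f"not an ASCII digit: U+{cp:04X}")
        value = value * 10 + (cp - 0x30)
    return value
```

With this definition, existing digit-handling code sees exactly the same ten
digits whether it is fed narrow or wide characters. A character such as
U+0663 (ARABIC-INDIC DIGIT THREE) is simply not a digit here.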
With regard to tables, I really am torn on this. I agree that 1.1 million
bytes is a bit much. I suspect though that it would still be the fastest
method. This is based on the fact that at least for English, almost all
characters are ASCII, which means that only 128 bytes have to be in the cache
to get a VERY good hit rate.
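
To make the table layout concrete, here is a sketch in Python, assuming one
byte of classification flags per code point. The flag names, the choice of
classes, and the helper has_class are hypothetical. Python itself says nothing
about cache behaviour, so this shows only the indexing scheme: the table covers
the whole Unicode code space (0x110000 = 1,114,112 entries), but every lookup
for an ASCII character lands in the first 128 bytes.

```python
MAX_CODE_POINT = 0x10FFFF  # last Unicode code point

# Hypothetical flag bits, one byte of flags per code point.
IS_DIGIT = 0x01
IS_SPACE = 0x02


def build_table() -> bytearray:
    """Build a flat table indexed directly by code point."""
    table = bytearray(MAX_CODE_POINT + 1)  # 1,114,112 zero bytes
    for cp in range(0x30, 0x3A):  # ASCII '0'..'9'
        table[cp] |= IS_DIGIT
    for cp in (0x09, 0x0A, 0x0B, 0x0C, 0x0D, 0x20):  # ASCII whitespace
        table[cp] |= IS_SPACE
    return table


TABLE = build_table()


def has_class(cp: int, flag: int) -> bool:
    """Single indexed load; ASCII input only ever touches TABLE[0:128]."""
    return (TABLE[cp] & flag) != 0
```

The alternative designs (multi-level tables, range searches) save space but
add at least one extra step per lookup, which is the trade-off in question.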