[MLton] Unicode / WideChar

Sun, 20 Nov 2005 20:46:29 -0800

> First, the database of these properties is provided in two files
> from unicode.org: UnicodeData.txt and PropList.txt. They total about
> 1M, but compress to about 150K. I think the right thing to do is to
> put these files inside the mlton svn with a tool that can parse them
> and output an appropriate file as part of the basis.

Sounds fine to me.
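
For illustration only, here is a rough sketch of what such a tool
might look like. It assumes the standard semicolon-separated layout of
UnicodeData.txt (field 0 is the code point in hex, field 2 is the
general category). The function names, the choice of Python, and the
shape of the emitted SML value are all hypothetical, not anything
decided for MLton.

```python
# Sketch of a generator: UnicodeData.txt lines -> SML source text.
# All names and the output format are hypothetical.

SAMPLE = [
    "0030;DIGIT ZERO;Nd;0;EN;;0;0;0;N;;;;;",
    "0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;",
    "0061;LATIN SMALL LETTER A;Ll;0;L;;;;;N;;;0041;;0041",
]


def parse_unicode_data(lines):
    """Map each code point to its general category (e.g. "Lu")."""
    categories = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        fields = line.split(";")
        code_point = int(fields[0], 16)  # field 0: code point, hex
        # Simplification: "<..., First>" / "<..., Last>" range pairs
        # are stored as two single entries, not expanded.
        categories[code_point] = fields[2]  # field 2: general category
    return categories


def emit_sml(categories):
    """Render the table as an SML list of (int, string) pairs."""
    entries = ",\n".join(
        '  (0x%X, "%s")' % (cp, cat)
        for cp, cat in sorted(categories.items())
    )
    return "val generalCategory = [\n" + entries + "\n]\n"


if __name__ == "__main__":
    print(emit_sml(parse_unicode_data(SAMPLE)))
```

A real tool would read the checked-in UnicodeData.txt and PropList.txt
rather than an inline sample, and would likely emit something more
compact than a flat list.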

> The next point of discussion is my take on the is* fields of
> WideChar.

My feeling is that if there are things in Unicode that don't make
sense in WideChar, then those components of WideChar should be
omitted.  No other SML implementation has WideChar, the Basis Library
wasn't really designed with Unicode in mind, and we're presumably
going to have a Unicode structure anyway, with clear semantics.

So, don't waste too much time trying to shoehorn in something that
doesn't fit.  I don't see any point.