[MLton] can mlbasis rename the top-level?
Matthew Fluet
fluet@cs.cornell.edu
Wed, 7 Sep 2005 10:17:42 -0400 (EDT)
> Forgive me for being stupid, but why not just use { and } ?
>
> It's easy to keep a count of the opening and closing {}s in the lexer.
> The only detail would be (afaics) to watch out for nested comments
> (since they could have unmatched {}s), but that just needs another
> state in the lexer.

It's not just comments. Unmatched {}s can also appear in strings (and
character constants). And open comment delimiters in strings shouldn't
actually start a comment. And escaped quotes shouldn't actually close the
string. And string quotes in comments shouldn't actually start a string.
You're required to duplicate, in its entirety, the most complicated
portion of the ML lexer in the MLB lexer. Not to mention that you are
adding the further complication of balancing {}s, a characteristic that is
not enforced by the ML lexer (though it is by the ML parser).

(Admittedly, the MLB lexer does already know how to handle ML-style
comments and ML-style strings (for file names and annotations).)
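
To make the amount of duplicated lexing concrete, here is a minimal sketch of a scanner that finds the } matching an opening { in embedded SML text. It is only an illustration, not MLton's MLB lexer: the function names are hypothetical, and it deliberately ignores details such as rejecting raw newlines inside strings. Character constants need no separate case here because #"{" contains an ordinary quoted string.

```python
def skip_comment(src, i):
    """Skip a (possibly nested) ML comment starting at src[i:i+2] == '(*'.

    Returns the index just past the matching '*)'. Quotes inside the
    comment are ignored, so they never start a string.
    """
    depth = 0
    n = len(src)
    while i < n:
        if src.startswith("(*", i):
            depth += 1
            i += 2
        elif src.startswith("*)", i):
            depth -= 1
            i += 2
            if depth == 0:
                return i
        else:
            i += 1
    raise ValueError("unterminated comment")


def skip_string(src, i):
    """Skip an ML string literal whose opening quote is at src[i].

    Returns the index just past the closing quote. Comment delimiters
    and braces inside the string are ignored.
    """
    n = len(src)
    i += 1  # step over the opening quote
    while i < n:
        c = src[i]
        if c == '"':
            return i + 1
        if c == "\\":
            if i + 1 < n and src[i + 1].isspace():
                # Formatting gap: backslash, whitespace, backslash.
                j = src.find("\\", i + 1)
                if j < 0:
                    break
                i = j + 1
            else:
                i += 2  # ordinary escape such as \" or \\
        else:
            i += 1
    raise ValueError("unterminated string")


def find_matching_brace(src, start):
    """Return the index of the '}' that matches the '{' at src[start]."""
    if src[start] != "{":
        raise ValueError("expected '{' at start")
    depth = 0
    i = start
    n = len(src)
    while i < n:
        if src.startswith("(*", i):
            i = skip_comment(src, i)
        elif src[i] == '"':
            i = skip_string(src, i)
        else:
            if src[i] == "{":
                depth += 1
            elif src[i] == "}":
                depth -= 1
                if depth == 0:
                    return i
            i += 1
    raise ValueError("unbalanced '{'")
```

Even this stripped-down version needs three cooperating states (code, comment, string). On malformed input such as { "oops } it reports "unterminated string" from the brace scanner itself, rather than whatever message an SML lexer would have produced for the same text.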

But, as I said before, if you could assume that you only ever got
syntactically well-formed source code, then there wouldn't be any problem.
The difficulty arises when a syntax error in embedded SML code (which might
have a recognizable/understandable error message if it were lexed/parsed
as SML) yields an unintelligible error related to the MLB lexer/parser.

Or, as Stephen put it: "One should either completely understand the
enclosed language or should not try to understand it at all. Any half-way
point is likely to be wrong or confusing, since it won't be the full
language."