[MLton] Long identifiers and def-use data
Matthew Fluet
fluet@cs.cornell.edu
Tue, 18 Apr 2006 09:59:28 -0400 (EDT)
> During the holiday I worked on an Emacs module to automatically highlight
> definitions and uses, while browsing SML code. The module will also support
> some other functionality, such as "jump-to-def" and "jump-to-next-ref". It
> still needs some work, but the highlighting part works under XEmacs (I will
> also port it to GNU Emacs). I noticed this issue while browsing some of my
> code.
Sounds very cool.
> I haven't yet had time to look at the MLton code that produces the def-use
> data. Is there some fundamental reason to treat long identifiers as
> atomic expressions as they are treated currently?
I don't believe that there is any fundamental difficulty with treating
long identifiers in the way you suggest, but I think there are a couple of
reasons why MLton ended up treating long identifiers as single
entities. Probably the most significant is that a careful reading of the
Definition suggests that a long identifier is a single lexical token, not
separate lexical tokens; for example, "Foo . bar" is a syntax error in
MLton, though not in every SML implementation. So, MLton lexes a long
identifier into a single token, and the parser associates one position
with the entire long identifier.
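
To illustrate the single-token reading, here is a rough sketch in Python. It
is not MLton's lexer; the names IDENT, LONGID and lex_longid are made up for
this example, and it only handles alphanumeric identifiers (no symbolic
identifiers, no type variables).

```python
import re

# Simplified assumption: alphanumeric identifiers only.
IDENT = r"[A-Za-z][A-Za-z0-9_']*"
# A long identifier is matched as ONE token: no whitespace allowed around '.'.
LONGID = re.compile(rf"{IDENT}(?:\.{IDENT})*")

def lex_longid(src, start=0):
    """Return (text, start, end) for the long identifier at `start`, or None.

    The whole long identifier gets a single (start, end) region, which
    mirrors the "one position for the entire long identifier" behavior.
    """
    m = LONGID.match(src, start)
    if m is None:
        return None
    return (m.group(0), m.start(), m.end())
```

With this sketch, "Foo.bar" comes back as one token covering offsets 0 to 7,
while for "Foo . bar" the token stops after "Foo", and the stray "." that
follows is what a parser would then reject.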
Looking at the source code, it appears that the parser does split the
single token into constituent pieces, but associates the same position
region with every piece. You could probably modify "fromSymbols" in
mlton/ast/longid.fun to compute the sub-region of each piece.
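
The sub-region computation itself should be simple, since the token contains
no whitespace and so sits on one line. A minimal sketch in Python (not the
actual longid.fun code; split_with_regions and the (line, start_col, end_col)
region shape are assumptions made for illustration, with an exclusive end
column):

```python
def split_with_regions(longid, line, col):
    """Split a long identifier such as 'Foo.Bar.v' into its pieces.

    `line` and `col` give the position of the first character of the
    whole token. Each piece is returned with its own region
    (line, start_col, end_col), where end_col is exclusive.
    """
    pieces = []
    for name in longid.split("."):
        end = col + len(name)
        pieces.append((name, (line, col, end)))
        col = end + 1  # step over the '.' separator
    return pieces
```

For a hypothetical "Foo.Bar.v" starting at position 3.1, this gives "Foo" at
columns 1 to 4, "Bar" at columns 5 to 8, and "v" at columns 9 to 10, all on
line 3.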
On the other hand, I'm kind of in favor of the current behavior, even for
navigating by use/def in Emacs. In your example, my mental model is that
"v" is being used _as a long identifier_ at position 3.1.