[MLton] Long identifiers and def-use data

Tue, 18 Apr 2006 09:59:28 -0400 (EDT)

> During the holiday I worked on an Emacs module to automatically highlight
> definitions and uses, while browsing SML code.  The module will also support
> some other functionality, such as "jump-to-def" and "jump-to-next-ref".  It
> still needs some work, but the highlighting part works under XEmacs (I will
> also port it to GNU Emacs).  I noticed this issue while browsing some of my
> code.

Sounds very cool.

> I haven't yet had time to look at the MLton code that produces the def-use
> data.  Is there some fundamental reason to treat long identifiers as
> atomic expressions as they are treated currently?

I don't believe that there is any fundamental difficulty with treating
long identifiers in the way you suggest, but I think there are a couple of
reasons that MLton ended up treating long identifiers as single
entities.  Probably the most significant is that a careful reading of the
Definition suggests that a long identifier is a single lexical token, not 
separate lexical tokens; for example, "Foo . bar" is a syntax error in 
MLton, though not in every SML implementation.  So, MLton lexes a long 
identifier into a single token, and the parser associates one position 
with the entire long identifier.

Looking at the source code, it appears that the parser does split the 
single token into constituent pieces, but associates the same position 
region with every piece.  You could probably modify "fromSymbols" in 
mlton/ast/longid.fun to compute the sub-region of each piece.
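
To make the sub-region idea concrete, here is a rough sketch in Python (not
MLton's SML, and not the actual "fromSymbols" code).  The function name,
the tuple layout, and the 1-based inclusive columns are all assumptions for
illustration.  It relies only on the point above: a long identifier is a
single token, so it contains no whitespace and sits on one line, and "." is
the only separator between pieces.

```python
from typing import List, Tuple

# Hypothetical helper: (piece, line, first_col, last_col), columns 1-based
# and inclusive.  This layout is an assumption, not MLton's region type.
Piece = Tuple[str, int, int, int]


def split_long_id(token: str, line: int, start_col: int) -> List[Piece]:
    """Split a long identifier token and give each piece its own columns."""
    pieces: List[Piece] = []
    col = start_col
    for piece in token.split("."):
        # The piece occupies len(piece) columns starting at col.
        pieces.append((piece, line, col, col + len(piece) - 1))
        # Advance past the piece and the "." that follows it.
        col += len(piece) + 1
    return pieces


if __name__ == "__main__":
    for p in split_long_id("Foo.Bar.baz", line=3, start_col=1):
        print(p)
```

With something like this, each structure identifier and the final
identifier would get a distinct position, rather than every piece sharing
the position of the whole token.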

On the other hand, I'm kind of in favor of the current behavior, even for 
navigating by use/def in Emacs.  In your example, my mental model is that 
"v" is being used _as a long identifier_ at position 3.1.