[MLton] bug in mllex?

Michael Norrish <Michael.Norrish@nicta.com.au>
Thu, 14 Apr 2005 15:23:09 +1000

I believe mllex is generating values for the internal yypos variable
that are off by one.  (The generated code initialises a variable
yygone0 to 1, and I think this should probably be zero.)
The following lex file:
----------------------------------------------------------------------
datatype tok = T of string | EOF
type lexresult = tok * int
fun eof() = (EOF, 0)
%%
space = [\ \t\n];
ident = [A-Za-z]+;
%structure testlex
%%
{ident} => ((T yytext, yypos));
{space} => (lex());
----------------------------------------------------------------------
and the following driver:
----------------------------------------------------------------------
fun read_from_string s = let
  val state = ref (Substring.full s)
  fun reader n = let
    open Substring
  in
    if n >= size (!state) then string (!state) before state := full ""
    else let
        val (left, right) = splitAt (!state, n)
      in
        state := right;
        string left
      end
  end
in
  reader
end
val lexer = testlex.makeLexer (read_from_string "hello world");
val _ = let val t = ref (lexer())
            open testlex.UserDeclarations
        in
          while (#1 (!t) <> EOF) do
            (let val (T s,n) = !t
             in
               print (s ^ ": " ^ Int.toString n ^ "\n");
               t := lexer()
             end)
        end
----------------------------------------------------------------------
will, when run, print out:
hello: 2
world: 8
with 2 and 8 supposedly being the positions of those words.  Clearly,
they should be 1 and 7 (or maybe even 0 and 6, if you believe that
character positions in a file start at zero).
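
As a cross-check on the expected values, here is a minimal sketch that
computes the start position of each word directly from the input string,
independently of mllex.  The function wordPositions and its base argument
are hypothetical helpers introduced only for this illustration; they are
not part of mllex or of the generated lexer.  It assumes a word is a
maximal run of letters, matching the {ident} rule above.

```sml
(* Hypothetical helper, not part of mllex: list each alphabetic word in s
   with the position of its first character, where the first character
   of the input counts as position `base` (0 or 1). *)
fun wordPositions (base : int) (s : string) : (string * int) list =
  let
    val n = String.size s
    (* i scans the input; start is SOME k while inside a word begun at k *)
    fun loop (i, start, acc) =
      if i >= n then
        (case start of
           SOME k =>
             List.rev ((String.substring (s, k, i - k), k + base) :: acc)
         | NONE => List.rev acc)
      else if Char.isAlpha (String.sub (s, i)) then
        loop (i + 1, (case start of NONE => SOME i | _ => start), acc)
      else
        (case start of
           SOME k =>
             loop (i + 1, NONE,
                   (String.substring (s, k, i - k), k + base) :: acc)
         | NONE => loop (i + 1, NONE, acc))
  in
    loop (0, NONE, [])
  end

val oneBased  = wordPositions 1 "hello world"
val zeroBased = wordPositions 0 "hello world"

(* Print the one-based positions in the same format as the driver *)
val () =
  List.app (fn (w, p) => print (w ^ ": " ^ Int.toString p ^ "\n")) oneBased
```

For "hello world" this gives 1 and 7 when counting from one, and 0 and 6
when counting from zero, so the 2 and 8 reported through yypos are one
larger than either convention would suggest.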
Michael.