fix for wordfreq for MLton
Henry Cejtin
henry@sourcelight.com
Tue, 12 Jun 2001 23:12:23 -0500
I already e-mailed Stephen, but he hasn't answered yet: wordfreq is so
slow in MLton because of a bug in the hash function it uses.
I sent him the fix, but if you just change the hash function so that it
actually loops over the string, it will run MUCH better (currently it
returns only 26 different hash values, so the buckets are really big).
Here is the fixed hash:
(* This hash function is taken from pages 56-57 of
 * The Practice of Programming by Kernighan and Pike.
 *)
fun hash (s: string): word =
   let
      val n = String.size s
      fun loop (i, w) =
         if i = n
            then w
         else loop (i + 1,
                    Word.fromInt (Char.ord (String.sub (s, i)))
                    + Word.* (w, 0w31))
   in
      loop (0, 0w0)
   end
After all, we on the MLton team have to show how fine it is.
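
To see why a hash that does not loop hurts so much, here is a rough sketch
comparing one against the looping hash above. The name firstCharHash, the
helper distinct, and the word list are made up for illustration, and
firstCharHash is only my guess at the kind of function that yields so few
values; it is not the actual code from the benchmark.

```sml
(* Hypothetical non-looping hash: an assumption for illustration only,
 * not the benchmark's real code. It looks at the first character alone.
 *)
fun firstCharHash (s: string): word =
   if String.size s = 0
      then 0w0
   else Word.fromInt (Char.ord (String.sub (s, 0)))

(* The looping hash from above, repeated so the sketch stands alone. *)
fun hash (s: string): word =
   let
      val n = String.size s
      fun loop (i, w) =
         if i = n
            then w
         else loop (i + 1,
                    Word.fromInt (Char.ord (String.sub (s, i)))
                    + Word.* (w, 0w31))
   in
      loop (0, 0w0)
   end

(* Count the distinct values in a list of words (quadratic, fine for a demo). *)
fun distinct (ws: word list): int =
   length (foldl (fn (w, seen) =>
                     if List.exists (fn w' => w' = w) seen
                        then seen
                     else w :: seen)
                 [] ws)

(* Made-up sample input: seven words that all start with the same letter. *)
val words = ["the", "then", "there", "this", "that", "to", "tea"]

val nFirst = distinct (map firstCharHash words)
val nLoop = distinct (map hash words)
```

On this input nFirst is 1 and nLoop is 7: the non-looping version puts all
seven words in one bucket, while the looping version gives each its own
hash value.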