bug in IntInf.scan StringCvt.HEX
Stephen Weeks
sweeks@intertrust.com
Mon, 10 May 1999 21:44:19 -0700 (PDT)
Here is my assessment of the consequences of the NJ bug on MLton.
G0 will not process IntInf constants correctly. Thus the G1 C code
will contain the SmallIntInf 0 hardwired in whenever a hex IntInf
constant appears in the source code (or basis library). Since there
are hex IntInf constants in the basis library in int-inf.sml it is
possible (but not necessary) that the G1 C code is wrong. This will
very possibly cause G1 to mess up when producing the G2 C code. Now,
we know from the self compile tests that G2 and G3 C code are the
same. However, I don't think from this that we can conclude that they
both don't have bugs involving IntInf constants.
My guess is that the constants used in
basis-library/integer/int-inf.sml weren't used in generating G2 and
hence G2 and G3 are OK. But I don't think it's worth studying in
detail to figure out if that is indeed true. So, here is my patch to
atoms/small-int-inf.fun to work around the NJ bug. I believe that
with this patch, GX works correctly for all X.
--------------------------------------------------------------------------------
(* Copyright (C) 1997-1999 NEC Research Institute.
* Please see the file LICENSE for license information.
*)
functor SmallIntInf(S: SMALL_INT_INF_STRUCTS): SMALL_INT_INF =
struct
open S
val hash = Word.fromLargeWord
structure Word = Word32
type t = Word.word
val equals = op =
fun toCstring w = "0x" ^ Word.toString w
val layout = Layout.str o toCstring
fun toMLstring w = Int.toString(Word.toIntX(Word.~>>(w, 0w1)))
(*
* The IntInf.fromInt calls are just because SML/NJ doesn't
* overload integer constants for IntInf.int's.
* This code relies on the language that MLton is implemented in using at
* least 31 bits for integers.
*)
val minSmall: IntInf.int = IntInf.fromInt ~0x40000000
val maxSmall: IntInf.int = IntInf.fromInt 0x3FFFFFFF
(* Warning: fromString will raise the Subscript exception on inputs "" and "~".
* This code is messier than it needs to be because it works around a bug in the
* SML/NJ implementation of IntInf.scan StringCvt.HEX.
*)
fun fromString (str: string): t option =
let val size = String.size str
fun reader offset =
if offset = size
then NONE
else SOME (String.sub (str, offset), offset + 1)
val start = if String.sub (str, 0) = #"~"
then 1
else 0
(* We carefully avoid calling IntInf.scan on the 0x in a hex constant
* because of the SML/NJ bug.
*)
val (base, start) =
if (String.sub (str, start) = #"0"
andalso (case reader (start + 1) of
SOME (#"x", next) => true
| _ => false))
then (StringCvt.HEX, start + 2)
else (StringCvt.DEC, 0)
val (v, _) = valOf (IntInf.scan base reader start)
(* Do the negation ourselves if we had the scan start inside the
* string.
*)
val v =
case (base, String.sub(str, 0)) of
(StringCvt.HEX, #"~") => IntInf.~ v
| _ => v
in if IntInf.<= (minSmall, v) andalso IntInf.<= (v, maxSmall)
then let val w = Word.fromInt (IntInf.toInt v)
val res = Word.orb (0w1, Word.<< (w, 0w1))
in SOME res
end
else NONE
end
(*
* If you want to compile MLton using an ML implementation which does not
* have IntInf, then use the following instead. Note, in this case Overflow
* MUST be raised.
*
* fun fromString (str: string): t option =
* let val size = String.size str
* fun reader offset =
* if offset = size
* then NONE
* else SOME (String.sub (str, offset), offset + 1)
* val start = if String.sub (str, 0) = #"~"
* then 1
* else 0
* val base = if String.sub (str, start) = #"0"
* then case reader (start + 1) of
* SOME (#"x", next) => StringCvt.HEX
* | _ => StringCvt.DEC
* else StringCvt.DEC
* in (case Pervasive.Int.scan base reader 0 of
* SOME (resv, _) =>
* let val resw = Word.fromInt resv
* val res = Word.orb (0w1, Word.<< (resw, 0w1))
* in if Word.toLargeIntX (Word.xorb (resw, res)) < 0
* then NONE
* else SOME res
* end
* | _ => Error.bug "SmallIntInf.fromString")
* handle Overflow => NONE
* end
*)
end