[MLton] cvs commit: added MLton.share

Stephen Weeks sweeks@mlton.org
Mon, 30 Aug 2004 21:56:43 -0700


sweeks      04/08/30 21:56:41

  Modified:    basis-library/misc primitive.sml
               basis-library/mlton mlton.sig mlton.sml
               doc      changelog
               doc/user-guide extensions.tex
               mlton/atoms hash-type.fun prim.fun prim.sig
               mlton/backend object-type.sig packed-representation.fun
                        rep-type.fun rssa.fun runtime.fun runtime.sig
                        ssa-to-rssa.fun
               mlton/codegen/c-codegen c-codegen.fun
               mlton/ssa ssa-tree2.fun
               runtime  gc.c gc.h
  Added:       regression mlton.share.ok mlton.share.sml
               runtime/basis/MLton share.c
  Log:
  MAIL added MLton.share
  
  	val MLton.share: 'a -> unit
  
  MLton.share x maximizes sharing in the heap for the object graph
  reachable from x.  It is implemented by the GC_share runtime function.
  It works by using hash consing, inserted at two places in the
  mark-compact depth-first traversal code.
  
  1. As the traversal of each object is finished, its contents are
  hashed and looked up in the hash table.  If an equivalent object is
  already in the table, then that object is returned.  Otherwise, the
  object just finished is inserted.
  
  2. When an already marked pointer is encountered, the contents of the
  object pointed to are hashed and looked up in the table.  Since the
  object is marked, we know some equivalent object will be there (it may
  be the object itself), so we replace the pointer with that object.
  
  In order for the runtime system to respect SML's notion of
  equivalence, it must have information about which objects require
  identity to be preserved (i.e. ref cells and arrays).  So, I added
  an additional field to GC_objectType, a boolean "hasIdentity", and
  propagated hasIdentity information through the backend into the
  codegens.
  
  Right now, GC_share uses temporary malloc'd space outside the SML heap
  for its hash table.  I plan to add code to use free space at the end
  of the heap if it is available.  However, once that space is used up,
  we either have to allocate additional memory or let the quality of the
  hash-consing degrade.  I'm not sure which is best.
  
  The rehashing of already-marked objects in (2) annoys me a bit.  The
  only way I see to alleviate that is to use an additional hash table,
  keyed on pointers.  I'm not sure if this is worth it, either in code
  complexity or time savings.
  
  Right now, only fixed-size objects are hashed.  It may be worthwhile
  to handle vectors as well.
  
  Next week, I plan to add
  
  	val MLton.shareAll: unit -> unit
  
  which will maximize sharing over the entire heap.  I've already put
  the code in the basis library and a stub in gc.c.  It shouldn't take
  too much to fill it out using ideas similar to the above -- i.e. do a
  mark-compact gc over the whole heap, with a little hash consing as
  objects are finished.  In this case, I may be able to avoid the extra
  hashing of (2) and an extra table by using a trick similar to forward
  pointers.  The reason this works with shareAll and not share is that
  with shareAll we can trash objects that are not inserted into the hash
  table, since we know they will be unreachable.  With share, there may
  be external pointers into the object graph, so we can't trash
  anything.
  
  Henry, since you've implemented all this before in Mathematica, I'd
  appreciate any comments you have.

Revision  Changes    Path
1.119     +2 -0      mlton/basis-library/misc/primitive.sml

Index: primitive.sml
===================================================================
RCS file: /cvsroot/mlton/mlton/basis-library/misc/primitive.sml,v
retrieving revision 1.118
retrieving revision 1.119
diff -u -r1.118 -r1.119
--- primitive.sml	27 Aug 2004 23:24:25 -0000	1.118
+++ primitive.sml	31 Aug 2004 04:56:09 -0000	1.119
@@ -750,6 +750,8 @@
 	    val native = _build_const "MLton_native": bool;
 (*       val deserialize = _prim "MLton_deserialize": Word8Vector.vector -> 'a ref; *)
 (*       val serialize = _prim "MLton_serialize": 'a ref -> Word8Vector.vector; *)
+	    val share = _prim "MLton_share": 'a -> unit;
+	    val shareAll = _import "MLton_shareAll": unit -> unit;
 	    val size = _prim "MLton_size": 'a ref -> int;
 
 	    structure Platform =



1.33      +3 -1      mlton/basis-library/mlton/mlton.sig

Index: mlton.sig
===================================================================
RCS file: /cvsroot/mlton/mlton/basis-library/mlton/mlton.sig,v
retrieving revision 1.32
retrieving revision 1.33
diff -u -r1.32 -r1.33
--- mlton.sig	5 Aug 2004 00:36:48 -0000	1.32
+++ mlton.sig	31 Aug 2004 04:56:19 -0000	1.33
@@ -1,4 +1,4 @@
-(* Copyright (C) 1999-2002 Henry Cejtin, Matthew Fluet, Suresh
+(* Copyright (C) 1999-2004 Henry Cejtin, Matthew Fluet, Suresh
  *    Jagannathan, and Stephen Weeks.
  * Copyright (C) 1997-1999 NEC Research Institute.
  *
@@ -18,6 +18,8 @@
       val isMLton: bool
       val safe: bool
 (*      val serialize: 'a -> Word8Vector.vector *)
+      val share: 'a -> unit
+      val shareAll: unit -> unit
       val size: 'a -> int
 
       structure Array: MLTON_ARRAY



1.35      +4 -1      mlton/basis-library/mlton/mlton.sml

Index: mlton.sml
===================================================================
RCS file: /cvsroot/mlton/mlton/basis-library/mlton/mlton.sml,v
retrieving revision 1.34
retrieving revision 1.35
diff -u -r1.34 -r1.35
--- mlton.sml	11 Jun 2004 12:37:42 -0000	1.34
+++ mlton.sml	31 Aug 2004 04:56:21 -0000	1.35
@@ -1,4 +1,4 @@
-(* Copyright (C) 1999-2002 Henry Cejtin, Matthew Fluet, Suresh
+(* Copyright (C) 1999-2004 Henry Cejtin, Matthew Fluet, Suresh
  *    Jagannathan, and Stephen Weeks.
  * Copyright (C) 1997-1999 NEC Research Institute.
  *
@@ -17,6 +17,9 @@
  *       val deserialize = fn x => !(deserialize x)
  *)
 
+val share = Primitive.MLton.share
+val shareAll = Primitive.MLton.shareAll
+   
 fun size x =
    let val refOverhead = 8 (* header + indirect *)
    in Primitive.MLton.size (ref x) - refOverhead



1.134     +4 -0      mlton/doc/changelog

Index: changelog
===================================================================
RCS file: /cvsroot/mlton/mlton/doc/changelog,v
retrieving revision 1.133
retrieving revision 1.134
diff -u -r1.133 -r1.134
--- changelog	28 Aug 2004 04:12:12 -0000	1.133
+++ changelog	31 Aug 2004 04:56:21 -0000	1.134
@@ -1,5 +1,9 @@
 Here are the changes since version 20040227.
 
+* 2004-08-30
+  - Added val MLton.share: 'a -> unit, which maximizes sharing in a
+    heap object.
+
 * 2004-08-27
   - Fixed bug in Real.toLargeInt.  It would incorrectly raise Option
     instead of Overflow in the case when the real was not an INF, but



1.73      +5 -0      mlton/doc/user-guide/extensions.tex

Index: extensions.tex
===================================================================
RCS file: /cvsroot/mlton/mlton/doc/user-guide/extensions.tex,v
retrieving revision 1.72
retrieving revision 1.73
diff -u -r1.72 -r1.73
--- extensions.tex	22 Aug 2004 20:43:40 -0000	1.72
+++ extensions.tex	31 Aug 2004 04:56:23 -0000	1.73
@@ -29,6 +29,7 @@
       val eq: 'a * 'a -> bool
       val isMLton: bool
       val safe: bool
+      val share: 'a -> unit
       val size: 'a -> int
 
       structure Array: MLTON_ARRAY
@@ -89,6 +90,10 @@
 \end{verbatim}
 When compiled {\tt -safe false}, {\tt sub} will reduce to
 {\tt unsafeSub}.
+
+\entry{share x}
+maximizes sharing in the heap for the object graph reachable from {\tt
+x}.
 
 \entry{size x}
 return the amount of heap space (in bytes) taken by the value of {\tt



1.17      +1 -0      mlton/mlton/atoms/hash-type.fun

Index: hash-type.fun
===================================================================
RCS file: /cvsroot/mlton/mlton/mlton/atoms/hash-type.fun,v
retrieving revision 1.16
retrieving revision 1.17
diff -u -r1.16 -r1.17
--- hash-type.fun	7 Jul 2004 02:00:34 -0000	1.16
+++ hash-type.fun	31 Aug 2004 04:56:25 -0000	1.17
@@ -284,6 +284,7 @@
        | MLton_halt => done ([defaultWord], unit)
        | MLton_handlesSignals => done ([], bool)
        | MLton_installSignalHandler => done ([], unit)
+       | MLton_share => oneTarg (fn t => ([t], unit))
        | MLton_size => oneTarg (fn t => ([t], defaultWord))
        | MLton_touch => oneTarg (fn t => ([t], unit))
        | Pointer_getPointer => oneTarg (fn t => ([pointer, defaultWord], t))



1.91      +7 -0      mlton/mlton/atoms/prim.fun

Index: prim.fun
===================================================================
RCS file: /cvsroot/mlton/mlton/mlton/atoms/prim.fun,v
retrieving revision 1.90
retrieving revision 1.91
diff -u -r1.90 -r1.91
--- prim.fun	14 Aug 2004 01:34:51 -0000	1.90
+++ prim.fun	31 Aug 2004 04:56:26 -0000	1.91
@@ -74,6 +74,7 @@
   *)
  | MLton_bug (* ssa to rssa *)
  | MLton_deserialize (* unused *)
+ | MLton_share
  | MLton_eq (* codegen *)
  | MLton_equal (* polymorphic equality *)
  | MLton_halt (* ssa to rssa *)
@@ -250,6 +251,7 @@
        | MLton_handlesSignals => "MLton_handlesSignals"
        | MLton_installSignalHandler => "MLton_installSignalHandler"
        | MLton_serialize => "MLton_serialize"
+       | MLton_share => "MLton_share"
        | MLton_size => "MLton_size"
        | MLton_touch => "MLton_touch"
        | Pointer_getPointer => "Pointer_getPointer"
@@ -385,6 +387,7 @@
     | (MLton_handlesSignals, MLton_handlesSignals) => true
     | (MLton_installSignalHandler, MLton_installSignalHandler) => true
     | (MLton_serialize, MLton_serialize) => true
+    | (MLton_share, MLton_share) => true
     | (MLton_size, MLton_size) => true
     | (MLton_touch, MLton_touch) => true
     | (Pointer_getPointer, Pointer_getPointer) => true
@@ -541,6 +544,7 @@
     | MLton_handlesSignals => MLton_handlesSignals
     | MLton_installSignalHandler => MLton_installSignalHandler
     | MLton_serialize => MLton_serialize
+    | MLton_share => MLton_share
     | MLton_size => MLton_size
     | MLton_touch => MLton_touch
     | Pointer_getPointer => Pointer_getPointer
@@ -746,6 +750,7 @@
        | MLton_handlesSignals => Functional
        | MLton_installSignalHandler => SideEffect
        | MLton_serialize => DependsOnState
+       | MLton_share => SideEffect
        | MLton_size => DependsOnState
        | MLton_touch => SideEffect
        | Pointer_getPointer => DependsOnState
@@ -944,6 +949,7 @@
        MLton_handlesSignals,
        MLton_installSignalHandler,
        MLton_serialize,
+       MLton_share,
        MLton_size,
        MLton_touch,
        Pointer_getPointer,
@@ -1061,6 +1067,7 @@
        | MLton_eq => one (arg 0)
        | MLton_equal => one (arg 0)
        | MLton_serialize => one (arg 0)
+       | MLton_share => one (arg 0)
        | MLton_size => one (arg 0)
        | MLton_touch => one (arg 0)
        | Pointer_getPointer => one result



1.68      +1 -0      mlton/mlton/atoms/prim.sig

Index: prim.sig
===================================================================
RCS file: /cvsroot/mlton/mlton/mlton/atoms/prim.sig,v
retrieving revision 1.67
retrieving revision 1.68
diff -u -r1.67 -r1.68
--- prim.sig	14 Aug 2004 01:34:51 -0000	1.67
+++ prim.sig	31 Aug 2004 04:56:28 -0000	1.68
@@ -83,6 +83,7 @@
 	     | MLton_handlesSignals (* closure conversion *)
 	     | MLton_installSignalHandler (* backend *)
 	     | MLton_serialize (* unused *)
+	     | MLton_share
 	     | MLton_size (* ssa to rssa *)
 	     | MLton_touch (* backend *)
 	     | Pointer_getPointer (* ssa to rssa *)



1.2       +4 -2      mlton/mlton/backend/object-type.sig

Index: object-type.sig
===================================================================
RCS file: /cvsroot/mlton/mlton/mlton/backend/object-type.sig,v
retrieving revision 1.1
retrieving revision 1.2
diff -u -r1.1 -r1.2
--- object-type.sig	25 Apr 2004 06:55:44 -0000	1.1
+++ object-type.sig	31 Aug 2004 04:56:29 -0000	1.2
@@ -5,8 +5,10 @@
 	 
       type ty
       datatype t =
-	 Array of ty
-       | Normal of ty
+	 Array of {elt: ty,
+		   hasIdentity: bool}
+       | Normal of {hasIdentity: bool,
+		    ty: ty}
        | Stack
        | Weak of ty (* in Weak t, must have Type.isPointer t *)
        | WeakGone



1.27      +14 -6     mlton/mlton/backend/packed-representation.fun

Index: packed-representation.fun
===================================================================
RCS file: /cvsroot/mlton/mlton/mlton/backend/packed-representation.fun,v
retrieving revision 1.26
retrieving revision 1.27
diff -u -r1.26 -r1.27
--- packed-representation.fun	19 Aug 2004 06:15:27 -0000	1.26
+++ packed-representation.fun	31 Aug 2004 04:56:31 -0000	1.27
@@ -2330,13 +2330,16 @@
 					    init = TupleRep.unit}
 			      val () = Vector.foreach (rs, fn r =>
 						       Value.affect (r, tr))
+			      val hasIdentity = Prod.isMutable args
 			      val () =
 				 List.push
 				 (delayedObjectTypes, fn () =>
 				  case Value.get tr of
 				     TupleRep.Indirect pr =>
-					SOME (pt, (ObjectType.Normal
-						   (PointerRep.componentsTy pr)))
+					SOME
+					(pt, (ObjectType.Normal
+					      {hasIdentity = hasIdentity,
+					       ty = PointerRep.componentsTy pr}))
 				   | _ => NONE)
 			      val () = setTupleRep (t, tr)
 			      fun compute () = TupleRep.rep (Value.get tr)
@@ -2349,6 +2352,7 @@
 			   end
 		      | ObjectCon.Vector => 
 			   let
+			      val hasIdentity = Prod.isMutable args
 			      val args = Prod.dest args
 			      fun new () =
 				 let
@@ -2374,12 +2378,15 @@
 						    TupleRep.ty tr
 					       | TupleRep.Indirect pr =>
 						    PointerRep.componentsTy pr
-					   val ty =
+					   val elt =
 					      if Type.isUnit ty
 						 then Type.zero Bits.inByte
 					      else ty
 					in
-					   SOME (pt, ObjectType.Array ty)
+					   SOME (pt,
+						 ObjectType.Array
+						 {elt = elt,
+						  hasIdentity = hasIdentity})
 					end)
 				 in
 				    Type.pointer pt
@@ -2463,11 +2470,12 @@
 	 Vector.fold
 	 (datatypes, [], fn ({cons, ...}, ac) =>
 	  Vector.fold
-	  (cons, ac, fn ({con, pointerTycon, ...}, ac) =>
+	  (cons, ac, fn ({args, con, pointerTycon, ...}, ac) =>
 	   case conRep con of
 	      ConRep.Tuple (TupleRep.Indirect pr) =>
 		 (pointerTycon,
-		  ObjectType.Normal (PointerRep.componentsTy pr)) :: ac
+		  ObjectType.Normal {hasIdentity = Prod.isMutable args,
+				     ty = PointerRep.componentsTy pr}) :: ac
 	    | _ => ac))
       val objectTypes = ref objectTypes
       val () =



1.11      +40 -26    mlton/mlton/backend/rep-type.fun

Index: rep-type.fun
===================================================================
RCS file: /cvsroot/mlton/mlton/mlton/backend/rep-type.fun,v
retrieving revision 1.10
retrieving revision 1.11
diff -u -r1.10 -r1.11
--- rep-type.fun	20 Aug 2004 16:34:44 -0000	1.10
+++ rep-type.fun	31 Aug 2004 04:56:31 -0000	1.11
@@ -337,8 +337,10 @@
       type ty = Type.t
 	 
       datatype t =
-	 Array of Type.t
-       | Normal of Type.t
+	 Array of {elt: Type.t,
+		   hasIdentity: bool}
+       | Normal of {hasIdentity: bool,
+		    ty: Type.t}
        | Stack
        | Weak of Type.t
        | WeakGone
@@ -348,8 +350,14 @@
 	    open Layout
 	 in
 	    case t of
-	       Array t => seq [str "Array ", Type.layout t]
-	     | Normal t => seq [str "Normal ", Type.layout t]
+	       Array {elt, hasIdentity} =>
+		  seq [str "Array ",
+		       record [("elt", Type.layout elt),
+			       ("hasIdentity", Bool.layout hasIdentity)]]
+	     | Normal {hasIdentity, ty} =>
+		  seq [str "Normal ",
+		       record [("hasIdentity", Bool.layout hasIdentity),
+			       ("ty", Type.layout ty)]]
 	     | Stack => str "Stack"
 	     | Weak t => seq [str "Weak ", Type.layout t]
 	     | WeakGone => str "WeakGone"
@@ -357,15 +365,15 @@
 
       fun isOk (t: t): bool =
 	 case t of
-	    Array t =>
+	    Array {elt, ...} =>
 	       let
-		  val b = Type.width t
+		  val b = Type.width elt
 	       in
 		  Bits.> (b, Bits.zero)
 		  andalso Bits.isByteAligned b
 	       end
-	  | Normal t =>
-	       not (Type.isUnit t) andalso Bits.isWordAligned (Type.width t)
+	  | Normal {ty, ...} =>
+	       not (Type.isUnit ty) andalso Bits.isWordAligned (Type.width ty)
 	  | Stack => true
 	  | Weak t => Type.isPointer t
 	  | WeakGone => true
@@ -373,14 +381,16 @@
       val stack = Stack
 
       val thread =
-	 Normal (Type.seq
-		 (Vector.new3 (Type.defaultWord,
-			       Type.defaultWord,
-			       Type.stack)))
+	 Normal {hasIdentity = true,
+		 ty = Type.seq (Vector.new3 (Type.defaultWord,
+					     Type.defaultWord,
+					     Type.stack))}
 
-      val word8Vector = Array Type.word8
+      val word8Vector = Array {hasIdentity = true,
+			       elt = Type.word8}
 
-      val wordVector = Array Type.defaultWord
+      val wordVector = Array {hasIdentity = true,
+			      elt = Type.defaultWord}
 
       (* Order in the following vector matters.  The basic pointer tycons must
        * correspond to the constants in gc.h.
@@ -403,18 +413,22 @@
       in
 	 fun toRuntime (t: t): R.t =
 	    case t of
-	       Array t => let
-			     val (b, p) = Type.bytesAndPointers t
-			  in
-			     R.Array {nonPointer = b,
-				      pointers = p}
-			  end
-	     | Normal t => let
-			      val (b, p) = Type.bytesAndPointers t
-			   in
-			      R.Normal {nonPointer = Bytes.toWords b,
-					pointers = p}
-			   end
+	       Array {elt, hasIdentity} =>
+		  let
+		     val (b, p) = Type.bytesAndPointers elt
+		  in
+		     R.Array {hasIdentity = hasIdentity,
+			      nonPointer = b,
+			      pointers = p}
+		  end
+	     | Normal {hasIdentity, ty} =>
+		  let
+		     val (b, p) = Type.bytesAndPointers ty
+		  in
+		     R.Normal {hasIdentity = hasIdentity,
+			       nonPointer = Bytes.toWords b,
+			       pointers = p}
+		  end
 	     | Stack => R.Stack
 	     | Weak _ => R.Weak
 	     | WeakGone => R.WeakGone



1.67      +2 -2      mlton/mlton/backend/rssa.fun

Index: rssa.fun
===================================================================
RCS file: /cvsroot/mlton/mlton/mlton/backend/rssa.fun,v
retrieving revision 1.66
retrieving revision 1.67
diff -u -r1.66 -r1.67
--- rssa.fun	28 Aug 2004 20:03:47 -0000	1.66
+++ rssa.fun	31 Aug 2004 04:56:32 -0000	1.67
@@ -1189,10 +1189,10 @@
 					     | Control.Align8 => 8))}))
 			   andalso
 			   (case tyconTy tycon of
-			       ObjectType.Normal t =>
+			       ObjectType.Normal {ty, ...} =>
 				  Bytes.equals
 				  (size, Bytes.+ (Runtime.normalHeaderSize,
-						  Type.bytes t))
+						  Type.bytes ty))
 			      | _ => false)
 			end
 		   | PrimApp {args, dst, prim} =>



1.20      +10 -6     mlton/mlton/backend/runtime.fun

Index: runtime.fun
===================================================================
RCS file: /cvsroot/mlton/mlton/mlton/backend/runtime.fun,v
retrieving revision 1.19
retrieving revision 1.20
diff -u -r1.19 -r1.20
--- runtime.fun	25 Apr 2004 06:55:44 -0000	1.19
+++ runtime.fun	31 Aug 2004 04:56:34 -0000	1.20
@@ -105,9 +105,11 @@
 structure RObjectType =
    struct
       datatype t =
-	 Array of {nonPointer: Bytes.t,
+	 Array of {hasIdentity: bool,
+		   nonPointer: Bytes.t,
 		   pointers: int}
-       | Normal of {nonPointer: Words.t,
+       | Normal of {hasIdentity: bool,
+		    nonPointer: Words.t,
 		    pointers: int}
        | Stack
        | Weak
@@ -118,13 +120,15 @@
 	    open Layout
 	 in
 	    case t of
-	       Array {nonPointer = np, pointers = p} =>
+	       Array {hasIdentity, nonPointer = np, pointers = p} =>
 		  seq [str "Array ",
-		       record [("nonPointer", Bytes.layout np),
+		       record [("hasIdentity", Bool.layout hasIdentity),
+			       ("nonPointer", Bytes.layout np),
 			       ("pointers", Int.layout p)]]
-	     | Normal {nonPointer = np, pointers = p} =>
+	     | Normal {hasIdentity, nonPointer = np, pointers = p} =>
 		  seq [str "Normal ",
-		       record [("nonPointer", Words.layout np),
+		       record [("hasIdentity", Bool.layout hasIdentity),
+			       ("nonPointer", Words.layout np),
 			       ("pointers", Int.layout p)]]
 	     | Stack => str "Stack"
 	     | Weak => str "Weak"



1.27      +4 -2      mlton/mlton/backend/runtime.sig

Index: runtime.sig
===================================================================
RCS file: /cvsroot/mlton/mlton/mlton/backend/runtime.sig,v
retrieving revision 1.26
retrieving revision 1.27
diff -u -r1.26 -r1.27
--- runtime.sig	25 Apr 2004 06:55:44 -0000	1.26
+++ runtime.sig	31 Aug 2004 04:56:36 -0000	1.27
@@ -52,9 +52,11 @@
       structure RObjectType:
 	 sig
 	    datatype t =
-	       Array of {nonPointer: Bytes.t,
+	       Array of {hasIdentity: bool,
+			 nonPointer: Bytes.t,
 			 pointers: int}
-	     | Normal of {nonPointer: Words.t,
+	     | Normal of {hasIdentity: bool,
+			  nonPointer: Words.t,
 			  pointers: int}
 	     | Stack
 	     | Weak



1.96      +14 -0     mlton/mlton/backend/ssa-to-rssa.fun

Index: ssa-to-rssa.fun
===================================================================
RCS file: /cvsroot/mlton/mlton/mlton/backend/ssa-to-rssa.fun,v
retrieving revision 1.95
retrieving revision 1.96
diff -u -r1.95 -r1.96
--- ssa-to-rssa.fun	20 Aug 2004 16:34:44 -0000	1.95
+++ ssa-to-rssa.fun	31 Aug 2004 04:56:36 -0000	1.96
@@ -184,6 +184,11 @@
 	    return = unit,
 	    writesStackTop = true}
 
+      fun share t =
+	 vanilla {args = Vector.new1 t,
+		  name = "MLton_share",
+		  return = unit}
+
       fun size t =
 	 vanilla {args = Vector.new1 t,
 		  name = "MLton_size",
@@ -896,6 +901,15 @@
 					   (Prim.wordEqual
 					    (WordSize.fromBits (Type.width t))))
 			       | MLton_installSignalHandler => none ()
+			       | MLton_share =>
+				    (case toRtype (varType (arg 0)) of
+					NONE => none ()
+				      | SOME t =>
+					   if not (Type.isPointer t)
+					      then none ()
+					   else
+					      simpleCCall (CFunction.share
+							   (Operand.ty (a 0))))
 			       | MLton_size =>
 				    simpleCCall
 				    (CFunction.size (Operand.ty (a 0)))



1.91      +9 -8      mlton/mlton/codegen/c-codegen/c-codegen.fun

Index: c-codegen.fun
===================================================================
RCS file: /cvsroot/mlton/mlton/mlton/codegen/c-codegen/c-codegen.fun,v
retrieving revision 1.90
retrieving revision 1.91
diff -u -r1.90 -r1.91
--- c-codegen.fun	28 Aug 2004 20:14:45 -0000	1.90
+++ c-codegen.fun	31 Aug 2004 04:56:38 -0000	1.91
@@ -338,20 +338,21 @@
 	  fn (_, ty) =>
 	  let
 	     datatype z = datatype Runtime.RObjectType.t
-	     val (tag, nonPointers, pointers) =
+	     val (tag, hasIdentity, nonPointers, pointers) =
 		case ObjectType.toRuntime ty of
-		   Array {nonPointer, pointers} =>
-		      (0, Bytes.toInt nonPointer, pointers)
-		 | Normal {nonPointer, pointers} =>
-		      (1, Words.toInt nonPointer, pointers)
+		   Array {hasIdentity, nonPointer, pointers} =>
+		      (0, hasIdentity, Bytes.toInt nonPointer, pointers)
+		 | Normal {hasIdentity, nonPointer, pointers} =>
+		      (1, hasIdentity, Words.toInt nonPointer, pointers)
 		 | Stack =>
-		      (2, 0, 0)
+		      (2, false, 0, 0)
 		 | Weak =>
-		      (3, 2, 1)
+		      (3, false, 2, 1)
 		 | WeakGone =>
-		      (3, 3, 0)
+		      (3, false, 3, 0)
 	  in
 	     concat ["{ ", C.int tag, ", ",
+		     C.bool hasIdentity, ", ",
 		     C.int nonPointers, ", ",
 		     C.int pointers, " }"]
 	  end)



1.21      +1 -0      mlton/mlton/ssa/ssa-tree2.fun

Index: ssa-tree2.fun
===================================================================
RCS file: /cvsroot/mlton/mlton/mlton/ssa/ssa-tree2.fun,v
retrieving revision 1.20
retrieving revision 1.21
diff -u -r1.20 -r1.21
--- ssa-tree2.fun	27 Aug 2004 23:07:37 -0000	1.20
+++ ssa-tree2.fun	31 Aug 2004 04:56:40 -0000	1.21
@@ -398,6 +398,7 @@
 	     | MLton_halt => done ([defaultWord], unit)
 	     | MLton_handlesSignals => done ([], bool)
 	     | MLton_installSignalHandler => done ([], unit)
+	     | MLton_share => oneArg (fn x => done ([x], unit))
 	     | MLton_size => oneArg (fn x => done ([x], defaultWord))
 	     | MLton_touch => oneArg (fn x => done ([x], unit))
 	     | Pointer_getPointer =>



1.1                  mlton/regression/mlton.share.ok

Index: mlton.share.ok
===================================================================
size of a is 1600
0 => NONE
1 => (1, 1)
2 => (0, 2)
3 => (1, 0)
4 => (0, 1)
5 => (1, 2)
6 => (0, 0)
7 => (1, 1)
8 => (0, 2)
9 => (1, 0)
10 => (0, 1)
11 => (1, 2)
12 => (0, 0)
13 => (1, 1)
14 => (0, 2)
15 => (1, 0)
16 => (0, 1)
17 => (1, 2)
18 => (0, 0)
19 => (1, 1)
20 => (0, 2)
21 => (1, 0)
22 => (0, 1)
23 => (1, 2)
24 => (0, 0)
25 => (1, 1)
26 => (0, 2)
27 => (1, 0)
28 => (0, 1)
29 => (1, 2)
30 => (0, 0)
31 => (1, 1)
32 => (0, 2)
33 => (1, 0)
34 => (0, 1)
35 => (1, 2)
36 => (0, 0)
37 => (1, 1)
38 => (0, 2)
39 => (1, 0)
40 => (0, 1)
41 => (1, 2)
42 => (0, 0)
43 => (1, 1)
44 => (0, 2)
45 => (1, 0)
46 => (0, 1)
47 => (1, 2)
48 => (0, 0)
49 => (1, 1)
50 => (0, 2)
51 => (1, 0)
52 => (0, 1)
53 => (1, 2)
54 => (0, 0)
55 => (1, 1)
56 => (0, 2)
57 => (1, 0)
58 => (0, 1)
59 => (1, 2)
60 => (0, 0)
61 => (1, 1)
62 => (0, 2)
63 => (1, 0)
64 => (0, 1)
65 => (1, 2)
66 => (0, 0)
67 => (1, 1)
68 => (0, 2)
69 => (1, 0)
70 => (0, 1)
71 => (1, 2)
72 => (0, 0)
73 => (1, 1)
74 => (0, 2)
75 => (1, 0)
76 => (0, 1)
77 => (1, 2)
78 => (0, 0)
79 => (1, 1)
80 => (0, 2)
81 => (1, 0)
82 => (0, 1)
83 => (1, 2)
84 => (0, 0)
85 => (1, 1)
86 => (0, 2)
87 => (1, 0)
88 => (0, 1)
89 => (1, 2)
90 => (0, 0)
91 => (1, 1)
92 => (0, 2)
93 => (1, 0)
94 => (0, 1)
95 => (1, 2)
96 => (0, 0)
97 => (1, 1)
98 => (0, 2)
99 => (1, 0)
size of a is 484
0 => NONE
1 => (1, 1)
2 => (0, 2)
3 => (1, 0)
4 => (0, 1)
5 => (1, 2)
6 => (0, 0)
7 => (1, 1)
8 => (0, 2)
9 => (1, 0)
10 => (0, 1)
11 => (1, 2)
12 => (0, 0)
13 => (1, 1)
14 => (0, 2)
15 => (1, 0)
16 => (0, 1)
17 => (1, 2)
18 => (0, 0)
19 => (1, 1)
20 => (0, 2)
21 => (1, 0)
22 => (0, 1)
23 => (1, 2)
24 => (0, 0)
25 => (1, 1)
26 => (0, 2)
27 => (1, 0)
28 => (0, 1)
29 => (1, 2)
30 => (0, 0)
31 => (1, 1)
32 => (0, 2)
33 => (1, 0)
34 => (0, 1)
35 => (1, 2)
36 => (0, 0)
37 => (1, 1)
38 => (0, 2)
39 => (1, 0)
40 => (0, 1)
41 => (1, 2)
42 => (0, 0)
43 => (1, 1)
44 => (0, 2)
45 => (1, 0)
46 => (0, 1)
47 => (1, 2)
48 => (0, 0)
49 => (1, 1)
50 => (0, 2)
51 => (1, 0)
52 => (0, 1)
53 => (1, 2)
54 => (0, 0)
55 => (1, 1)
56 => (0, 2)
57 => (1, 0)
58 => (0, 1)
59 => (1, 2)
60 => (0, 0)
61 => (1, 1)
62 => (0, 2)
63 => (1, 0)
64 => (0, 1)
65 => (1, 2)
66 => (0, 0)
67 => (1, 1)
68 => (0, 2)
69 => (1, 0)
70 => (0, 1)
71 => (1, 2)
72 => (0, 0)
73 => (1, 1)
74 => (0, 2)
75 => (1, 0)
76 => (0, 1)
77 => (1, 2)
78 => (0, 0)
79 => (1, 1)
80 => (0, 2)
81 => (1, 0)
82 => (0, 1)
83 => (1, 2)
84 => (0, 0)
85 => (1, 1)
86 => (0, 2)
87 => (1, 0)
88 => (0, 1)
89 => (1, 2)
90 => (0, 0)
91 => (1, 1)
92 => (0, 2)
93 => (1, 0)
94 => (0, 1)
95 => (1, 2)
96 => (0, 0)
97 => (1, 1)
98 => (0, 2)
99 => (1, 0)
size of a is 1024
0 => NONE
1 => (1, 1)
2 => (1, 1)
3 => (0, 0)
4 => (1, 1)
5 => (2, 2)
6 => (1, 1)
7 => (1, 1)
8 => (1, 1)
9 => (0, 0)
10 => (1, 1)
11 => (2, 2)
12 => (1, 1)
13 => (1, 1)
14 => (1, 1)
15 => (0, 0)
16 => (1, 1)
17 => (2, 2)
18 => (1, 1)
19 => (1, 1)
20 => (1, 1)
21 => (0, 0)
22 => (1, 1)
23 => (2, 2)
24 => (1, 1)
25 => (1, 1)
26 => (1, 1)
27 => (0, 0)
28 => (1, 1)
29 => (2, 2)
30 => (1, 1)
31 => (1, 1)
32 => (1, 1)
33 => (0, 0)
34 => (1, 1)
35 => (2, 2)
36 => (1, 1)
37 => (1, 1)
38 => (1, 1)
39 => (0, 0)
40 => (1, 1)
41 => (2, 2)
42 => (1, 1)
43 => (1, 1)
44 => (1, 1)
45 => (0, 0)
46 => (1, 1)
47 => (2, 2)
48 => (1, 1)
49 => (1, 1)
50 => (1, 1)
51 => (0, 0)
52 => (1, 1)
53 => (2, 2)
54 => (1, 1)
55 => (1, 1)
56 => (1, 1)
57 => (0, 0)
58 => (1, 1)
59 => (2, 2)
60 => (1, 1)
61 => (1, 1)
62 => (1, 1)
63 => (0, 0)
64 => (1, 1)
65 => (2, 2)
66 => (1, 1)
67 => (1, 1)
68 => (1, 1)
69 => (0, 0)
70 => (1, 1)
71 => (2, 2)
72 => (1, 1)
73 => (1, 1)
74 => (1, 1)
75 => (0, 0)
76 => (1, 1)
77 => (2, 2)
78 => (1, 1)
79 => (1, 1)
80 => (1, 1)
81 => (0, 0)
82 => (1, 1)
83 => (2, 2)
84 => (1, 1)
85 => (1, 1)
86 => (1, 1)
87 => (0, 0)
88 => (1, 1)
89 => (2, 2)
90 => (1, 1)
91 => (1, 1)
92 => (1, 1)
93 => (0, 0)
94 => (1, 1)
95 => (2, 2)
96 => (1, 1)
97 => (1, 1)
98 => (1, 1)
99 => (0, 0)
size of a is 448
0 => NONE
1 => (1, 1)
2 => (1, 1)
3 => (0, 0)
4 => (1, 1)
5 => (2, 2)
6 => (1, 1)
7 => (1, 1)
8 => (1, 1)
9 => (0, 0)
10 => (1, 1)
11 => (2, 2)
12 => (1, 1)
13 => (1, 1)
14 => (1, 1)
15 => (0, 0)
16 => (1, 1)
17 => (2, 2)
18 => (1, 1)
19 => (1, 1)
20 => (1, 1)
21 => (0, 0)
22 => (1, 1)
23 => (2, 2)
24 => (1, 1)
25 => (1, 1)
26 => (1, 1)
27 => (0, 0)
28 => (1, 1)
29 => (2, 2)
30 => (1, 1)
31 => (1, 1)
32 => (1, 1)
33 => (0, 0)
34 => (1, 1)
35 => (2, 2)
36 => (1, 1)
37 => (1, 1)
38 => (1, 1)
39 => (0, 0)
40 => (1, 1)
41 => (2, 2)
42 => (1, 1)
43 => (1, 1)
44 => (1, 1)
45 => (0, 0)
46 => (1, 1)
47 => (2, 2)
48 => (1, 1)
49 => (1, 1)
50 => (1, 1)
51 => (0, 0)
52 => (1, 1)
53 => (2, 2)
54 => (1, 1)
55 => (1, 1)
56 => (1, 1)
57 => (0, 0)
58 => (1, 1)
59 => (2, 2)
60 => (1, 1)
61 => (1, 1)
62 => (1, 1)
63 => (0, 0)
64 => (1, 1)
65 => (2, 2)
66 => (1, 1)
67 => (1, 1)
68 => (1, 1)
69 => (0, 0)
70 => (1, 1)
71 => (2, 2)
72 => (1, 1)
73 => (1, 1)
74 => (1, 1)
75 => (0, 0)
76 => (1, 1)
77 => (2, 2)
78 => (1, 1)
79 => (1, 1)
80 => (1, 1)
81 => (0, 0)
82 => (1, 1)
83 => (2, 2)
84 => (1, 1)
85 => (1, 1)
86 => (1, 1)
87 => (0, 0)
88 => (1, 1)
89 => (2, 2)
90 => (1, 1)
91 => (1, 1)
92 => (1, 1)
93 => (0, 0)
94 => (1, 1)
95 => (2, 2)
96 => (1, 1)
97 => (1, 1)
98 => (1, 1)
99 => (0, 0)
size of a is 2400
0 => NONE
1 => (1, 1)
2 => (0, 2)
3 => (1, 0)
4 => (0, 1)
5 => (1, 2)
6 => (0, 0)
7 => (1, 1)
8 => (0, 2)
9 => (1, 0)
10 => (0, 1)
11 => (1, 2)
12 => (0, 0)
13 => (1, 1)
14 => (0, 2)
15 => (1, 0)
16 => (0, 1)
17 => (1, 2)
18 => (0, 0)
19 => (1, 1)
20 => (0, 2)
21 => (1, 0)
22 => (0, 1)
23 => (1, 2)
24 => (0, 0)
25 => (1, 1)
26 => (0, 2)
27 => (1, 0)
28 => (0, 1)
29 => (1, 2)
30 => (0, 0)
31 => (1, 1)
32 => (0, 2)
33 => (1, 0)
34 => (0, 1)
35 => (1, 2)
36 => (0, 0)
37 => (1, 1)
38 => (0, 2)
39 => (1, 0)
40 => (0, 1)
41 => (1, 2)
42 => (0, 0)
43 => (1, 1)
44 => (0, 2)
45 => (1, 0)
46 => (0, 1)
47 => (1, 2)
48 => (0, 0)
49 => (1, 1)
50 => (0, 2)
51 => (1, 0)
52 => (0, 1)
53 => (1, 2)
54 => (0, 0)
55 => (1, 1)
56 => (0, 2)
57 => (1, 0)
58 => (0, 1)
59 => (1, 2)
60 => (0, 0)
61 => (1, 1)
62 => (0, 2)
63 => (1, 0)
64 => (0, 1)
65 => (1, 2)
66 => (0, 0)
67 => (1, 1)
68 => (0, 2)
69 => (1, 0)
70 => (0, 1)
71 => (1, 2)
72 => (0, 0)
73 => (1, 1)
74 => (0, 2)
75 => (1, 0)
76 => (0, 1)
77 => (1, 2)
78 => (0, 0)
79 => (1, 1)
80 => (0, 2)
81 => (1, 0)
82 => (0, 1)
83 => (1, 2)
84 => (0, 0)
85 => (1, 1)
86 => (0, 2)
87 => (1, 0)
88 => (0, 1)
89 => (1, 2)
90 => (0, 0)
91 => (1, 1)
92 => (0, 2)
93 => (1, 0)
94 => (0, 1)
95 => (1, 2)
96 => (0, 0)
97 => (1, 1)
98 => (0, 2)
99 => (1, 0)
size of a is 1284
0 => NONE
1 => (1, 1)
2 => (0, 2)
3 => (1, 0)
4 => (0, 1)
5 => (1, 2)
6 => (0, 0)
7 => (1, 1)
8 => (0, 2)
9 => (1, 0)
10 => (0, 1)
11 => (1, 2)
12 => (0, 0)
13 => (1, 1)
14 => (0, 2)
15 => (1, 0)
16 => (0, 1)
17 => (1, 2)
18 => (0, 0)
19 => (1, 1)
20 => (0, 2)
21 => (1, 0)
22 => (0, 1)
23 => (1, 2)
24 => (0, 0)
25 => (1, 1)
26 => (0, 2)
27 => (1, 0)
28 => (0, 1)
29 => (1, 2)
30 => (0, 0)
31 => (1, 1)
32 => (0, 2)
33 => (1, 0)
34 => (0, 1)
35 => (1, 2)
36 => (0, 0)
37 => (1, 1)
38 => (0, 2)
39 => (1, 0)
40 => (0, 1)
41 => (1, 2)
42 => (0, 0)
43 => (1, 1)
44 => (0, 2)
45 => (1, 0)
46 => (0, 1)
47 => (1, 2)
48 => (0, 0)
49 => (1, 1)
50 => (0, 2)
51 => (1, 0)
52 => (0, 1)
53 => (1, 2)
54 => (0, 0)
55 => (1, 1)
56 => (0, 2)
57 => (1, 0)
58 => (0, 1)
59 => (1, 2)
60 => (0, 0)
61 => (1, 1)
62 => (0, 2)
63 => (1, 0)
64 => (0, 1)
65 => (1, 2)
66 => (0, 0)
67 => (1, 1)
68 => (0, 2)
69 => (1, 0)
70 => (0, 1)
71 => (1, 2)
72 => (0, 0)
73 => (1, 1)
74 => (0, 2)
75 => (1, 0)
76 => (0, 1)
77 => (1, 2)
78 => (0, 0)
79 => (1, 1)
80 => (0, 2)
81 => (1, 0)
82 => (0, 1)
83 => (1, 2)
84 => (0, 0)
85 => (1, 1)
86 => (0, 2)
87 => (1, 0)
88 => (0, 1)
89 => (1, 2)
90 => (0, 0)
91 => (1, 1)
92 => (0, 2)
93 => (1, 0)
94 => (0, 1)
95 => (1, 2)
96 => (0, 0)
97 => (1, 1)
98 => (0, 2)
99 => (1, 0)
size of a is 2400
0 => NONE
1 => (1, 1)
2 => (0, 2)
3 => (1, 0)
4 => (0, 1)
5 => (1, 2)
6 => (0, 0)
7 => (1, 1)
8 => (0, 2)
9 => (1, 0)
10 => (0, 1)
11 => (1, 2)
12 => (0, 0)
13 => (1, 1)
14 => (0, 2)
15 => (1, 0)
16 => (0, 1)
17 => (1, 2)
18 => (0, 0)
19 => (1, 1)
20 => (0, 2)
21 => (1, 0)
22 => (0, 1)
23 => (1, 2)
24 => (0, 0)
25 => (1, 1)
26 => (0, 2)
27 => (1, 0)
28 => (0, 1)
29 => (1, 2)
30 => (0, 0)
31 => (1, 1)
32 => (0, 2)
33 => (1, 0)
34 => (0, 1)
35 => (1, 2)
36 => (0, 0)
37 => (1, 1)
38 => (0, 2)
39 => (1, 0)
40 => (0, 1)
41 => (1, 2)
42 => (0, 0)
43 => (1, 1)
44 => (0, 2)
45 => (1, 0)
46 => (0, 1)
47 => (1, 2)
48 => (0, 0)
49 => (1, 1)
50 => (0, 2)
51 => (1, 0)
52 => (0, 1)
53 => (1, 2)
54 => (0, 0)
55 => (1, 1)
56 => (0, 2)
57 => (1, 0)
58 => (0, 1)
59 => (1, 2)
60 => (0, 0)
61 => (1, 1)
62 => (0, 2)
63 => (1, 0)
64 => (0, 1)
65 => (1, 2)
66 => (0, 0)
67 => (1, 1)
68 => (0, 2)
69 => (1, 0)
70 => (0, 1)
71 => (1, 2)
72 => (0, 0)
73 => (1, 1)
74 => (0, 2)
75 => (1, 0)
76 => (0, 1)
77 => (1, 2)
78 => (0, 0)
79 => (1, 1)
80 => (0, 2)
81 => (1, 0)
82 => (0, 1)
83 => (1, 2)
84 => (0, 0)
85 => (1, 1)
86 => (0, 2)
87 => (1, 0)
88 => (0, 1)
89 => (1, 2)
90 => (0, 0)
91 => (1, 1)
92 => (0, 2)
93 => (1, 0)
94 => (0, 1)
95 => (1, 2)
96 => (0, 0)
97 => (1, 1)
98 => (0, 2)
99 => (1, 0)
size of a is 1600000
(1, 1)
size of a is 400084
(1, 1)



1.1                  mlton/regression/mlton.share.sml

Index: mlton.share.sml
===================================================================
(* tuple option array *)
val a = Array.tabulate (100, fn i => SOME (i mod 2, i mod 3))
val () = Array.update (a, 0, NONE)

fun msg () =
   (print (concat ["size of a is ", Int.toString (MLton.size a), "\n"])
    ; Array.appi (fn (i, z) =>
		  print (concat [Int.toString i, " => ",
				 case z of
				    NONE => "NONE"
				  | SOME (a, b) => 
				       concat ["(", Int.toString a, ", ",
					       Int.toString b, ")"],
				 "\n"])) a)

val () = msg ()
val () = MLton.share a
val () = msg ()

(* tuple option array with pre-existing sharing *)
val a = Array.tabulate (100, fn i =>
			if i mod 2 = 0
			   then SOME (1, 1)
			else SOME (i mod 3, i mod 3))
val () = Array.update (a, 0, NONE)
fun msg () =
   (print (concat ["size of a is ", Int.toString (MLton.size a), "\n"])
    ; Array.appi (fn (i, z) =>
		  print (concat [Int.toString i, " => ",
				 case z of
				    NONE => "NONE"
				  | SOME (a, b) => 
				       concat ["(", Int.toString a, ", ",
					       Int.toString b, ")"],
				       "\n"])) a)
val () = msg ()
val () = MLton.share a
val () = msg ()

(* tuple option ref array *)
   
val a = Array.tabulate (100, fn i => ref (SOME (i mod 2, i mod 3)))
val () = Array.sub (a, 0) := NONE

fun msg () =
   (print (concat ["size of a is ", Int.toString (MLton.size a), "\n"])
    ; Array.appi (fn (i, z) =>
		  print (concat [Int.toString i, " => ",
				 case !z of
				    NONE => "NONE"
				  | SOME (a, b) => 
				       concat ["(", Int.toString a, ", ",
					       Int.toString b, ")"],
				 "\n"])) a)

val () = msg ()
val () = MLton.share a
val () = msg ()
val () = Array.appi (fn (i, r) =>
		     r := (if i = 0 then NONE else (SOME (i mod 2, i mod 3)))) a
val () = msg ()

(* big tuple option array *)
val a = Array.tabulate (100000, fn i => SOME (i mod 2, i mod 3))
val () = Array.update (a, 0, NONE)

fun msg () =
   print (concat ["size of a is ", Int.toString (MLton.size a), "\n",
		  case Array.sub (a, 1) of
		     NONE => "NONE"
		   | SOME (a, b) => 
			concat ["(", Int.toString a, ", ", Int.toString b, ")"],
			"\n"])
   
val () = msg ()
val () = MLton.share a
val () = msg ()



1.201     +185 -14   mlton/runtime/gc.c

Index: gc.c
===================================================================
RCS file: /cvsroot/mlton/mlton/runtime/gc.c,v
retrieving revision 1.200
retrieving revision 1.201
diff -u -r1.200 -r1.201
--- gc.c	27 Aug 2004 00:50:42 -0000	1.200
+++ gc.c	31 Aug 2004 04:56:40 -0000	1.201
@@ -39,6 +39,7 @@
 	DEBUG_MARK_COMPACT = FALSE,
 	DEBUG_MEM = FALSE,
 	DEBUG_RESIZING = FALSE,
+	DEBUG_SHARE = FALSE,
 	DEBUG_STACKS = FALSE,
 	DEBUG_THREADS = FALSE,
 	DEBUG_WEAK = FALSE,
@@ -72,6 +73,7 @@
 				and objectTypeIndex < s->objectTypesSize);	\
 		t = &s->objectTypes [objectTypeIndex];				\
 		tag = t->tag;							\
+		hasIdentity = t->hasIdentity;					\
 		numNonPointers = t->numNonPointers;				\
 		numPointers = t->numPointers;					\
 		if (DEBUG_DETAILED)						\
@@ -717,6 +719,7 @@
 				pointer a, 
 				uint arrayIndex, 
 				uint pointerIndex) {
+	Bool hasIdentity;
 	word header;
 	uint numPointers;
 	uint numNonPointers;
@@ -763,6 +766,7 @@
 static inline pointer foreachPointerInObject (GC_state s, pointer p,
 						Bool skipWeaks,
 						GC_pointerFun f) {
+	Bool hasIdentity;
 	word header;
 	uint numPointers;
 	uint numNonPointers;
@@ -1378,7 +1382,7 @@
 static bool heapCreate (GC_state s, GC_heap h, W32 desiredSize, W32 minSize) {
 	W32 backoff;
 
-	if (DEBUG)
+	if (DEBUG_MEM)
 		fprintf (stderr, "heapCreate  desired size = %s  min size = %s\n",
 				uintToCommaString (desiredSize),
 				uintToCommaString (minSize));
@@ -1438,11 +1442,12 @@
 }
 
 static inline uint objectSize (GC_state s, pointer p) {
+	Bool hasIdentity;
 	uint headerBytes, objectBytes;
        	word header;
 	uint tag, numPointers, numNonPointers;
 
-	header = GC_getHeader(p);
+	header = GC_getHeader (p);
 	SPLIT_HEADER();
 	if (NORMAL_TAG == tag) { /* Fixed size object. */
 		headerBytes = GC_NORMAL_HEADER_SIZE;
@@ -1482,6 +1487,7 @@
 	if (DEBUG_DETAILED and FORWARDED == header)
 		fprintf (stderr, "already FORWARDED\n");
 	if (header != FORWARDED) { /* forward the object */
+		Bool hasIdentity;
 		uint headerBytes, objectBytes, size, skip;
 		uint numPointers, numNonPointers;
 
@@ -1930,6 +1936,138 @@
 }
 
 /* ---------------------------------------------------------------- */
+/*                       Object hash consing                        */
+/* ---------------------------------------------------------------- */
+
+static GC_ObjectHashTable newTable () {
+	GC_ObjectHashTable t;
+
+	NEW (t);
+	t->numElements = 0;
+	t->elementsSize = 1024; /* pretty arbitrary. */
+	ARRAY (t->elements, t->elementsSize);
+	return t;
+}
+
+static void destroyTable (GC_ObjectHashTable t) {
+	free (t->elements);
+	free (t);
+}
+
+static void tableGrow (GC_ObjectHashTable t) {	
+	struct GC_ObjectHashElement *elements0;
+	int i;
+	int s0;
+
+	if (DEBUG_SHARE)
+		fprintf (stderr, "tableGrow\n");
+	s0 = t->elementsSize;
+	t->elementsSize *= 2;
+	elements0 = t->elements;
+	ARRAY (t->elements, t->elementsSize);
+	for (i = 0; i < s0; ++i) {
+		GC_ObjectHashElement e;
+		GC_ObjectHashElement e0;
+
+		e0 = &elements0[i];
+		unless (NULL == e0->object) {
+			for (e = &t->elements[e0->hash % t->elementsSize]; 
+				NULL != e->object; 
+				++e)
+			e->hash = e0->hash;
+			e->object = e0->object;
+		}
+	}
+	free (elements0);
+}
+
+static Pointer hashCons (GC_state s, Pointer object) {
+	GC_ObjectHashElement e;
+	Bool hasIdentity;
+	Word32 hash;
+	word header;
+	word *max;
+	uint numPointers;
+	uint numNonPointers;
+	word *p;
+	GC_ObjectHashTable t;
+	uint tag;
+
+	t = s->objectHashTable;
+	if (DEBUG_SHARE)
+		fprintf (stderr, "hashCons (0x%08x)\n", (uint)object);
+	header = GC_getHeader (object);
+	SPLIT_HEADER();
+	if (hasIdentity) {
+		/* Don't hash cons. */
+		if (DEBUG_SHARE)
+			fprintf (stderr, "hasIdentity\n");
+		return object;
+	}
+	/* Compute the hash. */
+	max = (word*)(object + toBytes (numPointers + numNonPointers));
+	hash = header;
+	for (p = (word*)object; p < max; ++p)
+		hash = hash * 31 + *p;
+	/* Look in the table. */
+	e = &t->elements[hash % t->elementsSize];
+look:
+	if (NULL == e->object) {
+		/* It's not in the table.  Add it. */
+		assert (NULL == e->object);
+		e->hash = hash;
+		e->object = object;
+		t->numElements++;
+		/* Maybe grow the table. */
+		if (t->numElements * 2 > t->elementsSize)
+			tableGrow (s->objectHashTable);
+		if (DEBUG_SHARE)
+			fprintf (stderr, "0x%08x = hashCons (0x%08x)\n", 
+					(uint)object, (uint)object);
+		return object;
+	}
+	if (hash == e->hash) {
+		Header header2;
+		word *p2;
+
+		if (DEBUG_SHARE)
+			fprintf (stderr, "comparing 0x%08x to 0x%08x\n",
+					(uint)object, (uint)e->object);
+		/* Compare object to e->object. */
+		unless (object == e->object) {
+			header2 = GC_getHeader (e->object);
+			unless (header == header2) {
+				++e; 
+				goto look;
+			}
+			for (p = (word*)object, p2 = (word*)e->object; 
+					p < max; 
+					++p, ++p2)
+				unless (*p == *p2) {
+					++e;
+					goto look;
+				}
+		}
+		/* object is equal to e->object. */
+		if (DEBUG_SHARE)
+			fprintf (stderr, "0x%08x = hashCons (0x%08x)\n", 
+					(uint)e->object, (uint)object);
+		return e->object;
+	}
+	assert (FALSE);
+	return NULL; /* quell gcc warning. */
+}
+
+static inline void maybeSharePointer (GC_state s, Pointer *pp) {
+	unless (s->shouldHashCons)
+		return;
+	if (DEBUG_SHARE)
+		fprintf (stderr, "maybeSharePointer  pp = 0x%08x  *pp = 0x%08x\n",
+				(uint)pp, (uint)*pp);
+	*pp = hashCons (s, *pp);	
+}
+
+/* ---------------------------------------------------------------- */
 /*                       Depth-first Marking                        */
 /* ---------------------------------------------------------------- */
 
@@ -1949,9 +2087,9 @@
 	return (MARK_MODE == m) ? isMarked (p): not isMarked (p);
 }
 
-/* mark (s, p) sets all the mark bits in the object graph pointed to by p. 
- * If the mode is MARK, it sets the bits to 1.
- * If the mode is UNMARK, it sets the bits to 0.
+/* mark (s, p, m) sets all the mark bits in the object graph pointed to by p. 
+ * If m is MARK_MODE, it sets the bits to 1.
+ * If m is UNMARK_MODE, it sets the bits to 0.
  *
  * It returns the total size in bytes of the objects marked.
  */
@@ -1959,6 +2097,7 @@
 	uint arrayIndex;
 	pointer cur;  /* The current object being marked. */
 	GC_offsets frameOffsets;
+	Bool hasIdentity;
 	Header* headerp;
 	Header header;
 	uint index;
@@ -2030,6 +2169,8 @@
 			fprintf (stderr, "markInNormal  index = %d\n", index);
 		if (todo == max) {
 			*headerp = header & ~COUNTER_MASK;
+			if (s->shouldHashCons)
+				cur = hashCons (s, cur);
 			goto ret;
 		}
 		next = *(pointer*)todo;
@@ -2042,8 +2183,10 @@
 		nextHeaderp = GC_getHeaderp (next);
 		nextHeader = *nextHeaderp;
 		if ((nextHeader & MARK_MASK)
-			== (MARK_MODE == mode ? MARK_MASK : 0))
+			== (MARK_MODE == mode ? MARK_MASK : 0)) {
+			maybeSharePointer (s, (pointer*)todo);
 			goto markNextInNormal;
+		}
 		*headerp = (header & ~COUNTER_MASK) |
 				(index << COUNTER_SHIFT);
 		headerp = nextHeaderp;
@@ -2090,8 +2233,10 @@
 			nextHeaderp = GC_getHeaderp (next);
 			nextHeader = *nextHeaderp;
 			if ((nextHeader & MARK_MASK)
-				== (MARK_MODE == mode ? MARK_MASK : 0))
+				== (MARK_MODE == mode ? MARK_MASK : 0)) {
+				maybeSharePointer (s, (pointer*)todo);
 				goto markArrayContinue;
+			}
 			/* Recur and mark next. */
 			*arrayCounterp (cur) = arrayIndex;
 			*headerp = (header & ~COUNTER_MASK) |
@@ -2157,6 +2302,7 @@
 		if ((nextHeader & MARK_MASK)
 			== (MARK_MODE == mode ? MARK_MASK : 0)) {
 			index++;
+			maybeSharePointer (s, (pointer*)todo);
 			goto markInFrame;
 		}
 		((GC_stack)cur)->markIndex = index;		
@@ -2223,6 +2369,28 @@
 	assert (FALSE);
 }
 
+void GC_share (GC_state s, Pointer object) {
+	if (DEBUG_SHARE)
+		fprintf (stderr, "GC_share 0x%08x\n", (uint)object);
+	mark (s, object, MARK_MODE);
+	s->shouldHashCons = TRUE;
+	s->objectHashTable = newTable ();
+	mark (s, object, UNMARK_MODE);
+	destroyTable (s->objectHashTable);
+	s->shouldHashCons = FALSE;
+}
+
+//static inline void shareGlobal (GC_state s, pointer *pp) {
+//	mark (s, pp, MARK_MODE)
+//}
+
+void GC_shareAll (GC_state s) {
+	if (DEBUG_SHARE)
+		fprintf (stderr, "GC_shareAll\n");
+	die ("GC_shareAll unimplemented\n");
+//	foreachGlobal (s, shareGlobal);
+}
+
 /* ---------------------------------------------------------------- */
 /*                 Jonkers Mark-compact Collection                  */
 /* ---------------------------------------------------------------- */
@@ -2250,6 +2418,7 @@
  * then clear the object pointer.
  */
 static inline void maybeClearWeak (GC_state s, pointer p) {
+	Bool hasIdentity;
 	Header header;
 	Header *headerp;
 	uint numPointers;
@@ -2762,7 +2931,7 @@
 
 static void majorGC (GC_state s, W32 bytesRequested, bool mayResize) {
 	s->numMinorsSinceLastMajor = 0;
-        if (not FORCE_MARK_COMPACT
+        if ((not FORCE_MARK_COMPACT)
  		and s->heap.size < s->ram
 		and (not heapIsInit (&s->heap2)
 			or heapAllocateSecondSemi (s, heapDesiredSize (s, (W64)s->bytesLive + bytesRequested, 0))))
@@ -2842,11 +3011,11 @@
 	if (needGCTime (s))
 		startTiming (&ru_start);
 	minorGC (s);
-	stackTopOk = mutatorStackInvariant(s);
+	stackTopOk = mutatorStackInvariant (s);
 	stackBytesRequested =
 		stackTopOk
 		? 0 
-		: stackBytes (s, max(2 * s->currentThread->stack->reserved, 
+		: stackBytes (s, max (2 * s->currentThread->stack->reserved, 
 					stackNeedsReserved (s, s->currentThread->stack)));
 	totalBytesRequested = 
 		(W64)oldGenBytesRequested 
@@ -3021,15 +3190,16 @@
 
 pointer GC_arrayAllocate (GC_state s, W32 ensureBytesFree, W32 numElts, 
 				W32 header) {
-	uint numPointers;
-	uint numNonPointers;
-	uint tag;
-	uint eltSize;
 	W64 arraySize64;
 	W32 arraySize;
+	uint eltSize;
 	W32 *frontier;
+	Bool hasIdentity;
 	W32 *last;
+	uint numPointers;
+	uint numNonPointers;
 	pointer res;
+	uint tag;
 
 	SPLIT_HEADER();
 	eltSize = numPointers * POINTER_SIZE + numNonPointers;
@@ -4230,6 +4400,7 @@
 	s->pageSize = getpagesize ();
 	s->ramSlop = 0.5;
 	s->savedThread = BOGUS_THREAD;
+	s->shouldHashCons = FALSE;
 	s->signalHandler = BOGUS_THREAD;
 	s->signalIsPending = FALSE;
 	s->startTime = currentTime ();



1.80      +26 -0     mlton/runtime/gc.h

Index: gc.h
===================================================================
RCS file: /cvsroot/mlton/mlton/runtime/gc.h,v
retrieving revision 1.79
retrieving revision 1.80
diff -u -r1.79 -r1.80
--- gc.h	26 Aug 2004 03:54:39 -0000	1.79
+++ gc.h	31 Aug 2004 04:56:40 -0000	1.80
@@ -97,6 +97,7 @@
 typedef struct {
 	/* Keep tag first, at zero offset, since it is referenced most often. */
 	GC_ObjectTypeTag tag;
+	Bool hasIdentity;
 	ushort numNonPointers;
 	ushort numPointers;
 } GC_ObjectType;
@@ -148,6 +149,23 @@
 } GC_frameLayout;
 
 /* ------------------------------------------------- */
+/*                   hash consing                    */
+/* ------------------------------------------------- */
+
+typedef Word32 Hash;
+
+typedef struct GC_ObjectHashElement {
+	Hash hash;
+	Pointer object;
+} *GC_ObjectHashElement;
+
+typedef struct GC_ObjectHashTable {
+	struct GC_ObjectHashElement *elements;
+	int elementsSize;
+	int numElements;
+} *GC_ObjectHashTable;
+
+/* ------------------------------------------------- */
 /*                     GC_stack                      */
 /* ------------------------------------------------- */
 
@@ -405,6 +423,7 @@
 	 */
 	float nurseryRatio;
 	pointer nursery;
+	GC_ObjectHashTable objectHashTable;
 	GC_ObjectType *objectTypes; /* Array of object types. */
 	uint objectTypesSize;
 	/* Arrays larger than oldGenArraySize are allocated in the old generation
@@ -444,6 +463,7 @@
 	 * signal handler.
 	 */
 	sigset_t signalsPending;
+	Bool shouldHashCons;
 	struct GC_sourceLabel *sourceLabels;
 	uint sourceLabelsSize;
 	/* sourcesNames is an array of strings identifying source positions. */
@@ -653,6 +673,12 @@
 
 /* Return a serialized version of the object rooted at root. */
 /* pointer GC_serialize(GC_state s, pointer root); */
+
+/* GC_share maximizes sharing in a single object. */
+void GC_share (GC_state s, Pointer object);
+
+/* GC_share maximizes sharing in the entire heap. */
+void GC_shareAll (GC_state s);
 
 /* Return the amount of heap space taken by the object pointed to by root. */
 uint GC_size (GC_state s, pointer root);



1.1                  mlton/runtime/basis/MLton/share.c

Index: share.c
===================================================================
#include "platform.h"

extern struct GC_state gcState;

void MLton_share (Pointer p) {
	GC_share (&gcState, p);
}

void MLton_shareAll () {
	GC_shareAll (&gcState);
}