[MLton-commit] r6690

Thu Aug 7 10:09:22 PDT 2008

When the assembler gets an instruction
	movq $immediate, %rax
it needs to put the immediate into machine code. On amd64, 32-bit immediates
are sign extended to 64 bits by the processor. 64-bit immediates are possible
only with movq and cannot be used with addq etc. MLton already knows that if
a 64-bit immediate cannot be represented as the sign extension of a 32-bit
value, it must first movq the value into a register, e.g.:
	movq $0x1234567890, %rax
	addq %rax, %rbx
because
	addq $0x1234567890, %rbx
is not valid assembly.
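
The representability test described above can be sketched in C. This is only
an illustration, not MLton's code; the name fits_simm32 is hypothetical. A
64-bit value is the sign extension of a 32-bit value exactly when bits 31
through 63 are all equal.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if v is the sign extension of some 32-bit
   value, i.e. it could be encoded as the 32-bit immediate of an
   instruction such as addq. */
bool fits_simm32(uint64_t v)
{
    /* After the shift, the low 33 bits hold bits 31..63 of v.
       They must be all zeros or all ones. */
    uint64_t high = v >> 31;
    return high == 0 || high == 0x1FFFFFFFFULL;
}
```

Under this test 0x1234567890 is rejected, so it has to go through a movq
first, while 0xFFFFFFFFFFFFFFFF is accepted because it is the sign extension
of 0xFFFFFFFF.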

The problem this patch addresses is the corner case of an immediate that fits
in 32 bits but has the high bit (bit 31) set. No 32-bit value yields it after
sign extension, so MLton never generates an instruction using this immediate
without first going through a movq. However, when the assembler sees movq
with this immediate, it must decide between the 64-bit and 32-bit encodings.
The gas in Debian chooses the 64-bit immediate. The (newer) gas on win64
chooses the 32-bit immediate. These give different results:

movq $0xFFFFFFFF, %rax --- when 32-bit: %rax = 0xFFFFFFFFFFFFFFFF = -1 (~1 in SML)
movq $0xFFFFFFFF, %rax --- when 64-bit: %rax = 0x00000000FFFFFFFF

MLton always expects the assembler to produce the second case.
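
The two outcomes can be reproduced with a small C model of what each encoding
leaves in the register. This is a sketch for illustration only; the function
names movq_simm32 and movq_imm64 are hypothetical and stand for the 32-bit
sign-extended and the full 64-bit immediate encodings of movq.

```c
#include <stdint.h>

/* Hypothetical model: movq encoded with a 32-bit immediate.
   The processor sign extends the immediate to 64 bits. */
uint64_t movq_simm32(uint32_t imm)
{
    uint64_t v = imm;
    if (imm & 0x80000000u)            /* high bit set: fill upper half with ones */
        v |= 0xFFFFFFFF00000000ULL;
    return v;
}

/* Hypothetical model: movq encoded with a full 64-bit immediate.
   The register receives the immediate unchanged. */
uint64_t movq_imm64(uint64_t imm)
{
    return imm;
}
```

For the immediate 0xFFFFFFFF the first model gives 0xFFFFFFFFFFFFFFFF and the
second gives 0x00000000FFFFFFFF. For an immediate with bit 31 clear, such as
3, both models agree.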

Fortunately, there is an easy fix which also produces smaller machine code:
	movl $3, %eax
32-bit operations on amd64 implicitly zero extend the result register. This
instruction is completely equivalent to
	movq $3, %rax
By exploiting this implicit zero extension, we can be unambiguous with movl:
	movl $0xFFFFFFFF, %eax --- always sets %rax = 0x00000000FFFFFFFF

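Therefore, whenever the destination register is 64 bit and the immediate is
unchanged by truncation to 32 bits followed by zero extension, use a movl.
This will always do what MLton expects. If the value does not pass this test,
output a movq as before: the assembler must then either use the 64-bit
encoding or, for a value that is the sign extension of a negative 32-bit
value, may use the 32-bit encoding, which loads the same value.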
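
The selection rule can be sketched in C as follows. This mirrors the
condition in the patch (the value equals itself after being resized to 32
bits and back to 64 bits), but it is an illustration rather than MLton's
code: the names mov_form, choose_mov and load_result are hypothetical.

```c
#include <stdint.h>

/* Hypothetical tag for the instruction chosen for a 64-bit destination. */
typedef enum { MOVL_IMM32, MOVQ_IMM64 } mov_form;

/* Pick movl when v survives truncation to 32 bits followed by zero
   extension; otherwise keep movq. */
mov_form choose_mov(uint64_t v)
{
    return (v == (uint64_t)(uint32_t)v) ? MOVL_IMM32 : MOVQ_IMM64;
}

/* Value left in the 64-bit register by the chosen instruction. */
uint64_t load_result(uint64_t v)
{
    if (choose_mov(v) == MOVL_IMM32)
        return (uint64_t)(uint32_t)v;  /* movl zeroes the upper 32 bits */
    return v;                          /* movq: the register receives v */
}
```

With this rule 3 and 0xFFFFFFFF are loaded with movl, 0x1234567890 and
0xFFFFFFFFFFFFFFFF are loaded with movq, and in every case the register ends
up holding the intended value.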


----------------------------------------------------------------------

U   mlton/trunk/mlton/codegen/amd64-codegen/amd64-allocate-registers.fun

----------------------------------------------------------------------

Modified: mlton/trunk/mlton/codegen/amd64-codegen/amd64-allocate-registers.fun
===================================================================

--- mlton/trunk/mlton/codegen/amd64-codegen/amd64-allocate-registers.fun	2008-08-06 00:21:53 UTC (rev 6689)
+++ mlton/trunk/mlton/codegen/amd64-codegen/amd64-allocate-registers.fun	2008-08-07 17:09:20 UTC (rev 6690)
@@ -4068,11 +4068,19 @@
             val _ = Int.dec depth
             val instruction
               = case Immediate.destruct immediate of
-                   Immediate.Word _ =>
-                      Assembly.instruction_mov 
-                      {dst = Operand.Register final_register,
-                       src = Operand.Immediate immediate,
-                       size = size}
+                   Immediate.Word x =>
+                      if size = Size.QUAD andalso
+                         WordX.equals (x, WordX.resize (WordX.resize (x, WordSize.word32), WordSize.word64))
+                      then (* use the implicit zero-extend of 32 bit ops *)
+                       Assembly.instruction_mov 
+                       {dst = Operand.Register (Register.lowPartOf (final_register, Size.LONG)),
+                        src = Operand.immediate_word (WordX.resize (x, WordSize.word32)),
+                        size = Size.LONG}
+                      else
+                       Assembly.instruction_mov 
+                       {dst = Operand.Register final_register,
+                        src = Operand.Immediate immediate,
+                        size = size}
                  | _ =>
                       Assembly.instruction_lea
                       {dst = Operand.Register final_register,