[MLton] power pc "port"
Matthew Fluet
fluet@cs.cornell.edu
Sun, 5 Sep 2004 09:12:55 -0400 (EDT)
> > BTW, feel free to add comments to the code in places like this where
> > you learned stuff that wasn't clear initially. I have no problem with
> > checking in patches containing only comments, and it'll make it easier
> > for our rapidly growing developer base.
>
> I added a comment explaining how fmt works.
>
> By the way, I found the problem. The problem is that the negation
> function (Word8_neg, Word16_neg) is declared to operate over unsigned
> words. This makes GCC decide to compile Word8_neg as follows:
>
> neg r3,r3 ; negate the argument
> rlwinm r3,r3,0,0xff ; keep only the low 8 bits (clear the upper 24)
>
> Likewise, Word16_neg is compiled as follows:
>
> neg r3,r3 ; negate the argument
> rlwinm r3,r3,0,0xffff ; keep only the low 16 bits (clear the upper 16)
>
> What does this mean? Since my PowerPC only has 32-bit registers (other
> PowerPCs may only have 64-bit registers, which leads to even more fun),
> 8-bit signed and 16-bit signed words get stuffed into 32-bit registers
> with sign extension. Ordinarily, the compiler picks instructions for
> these 8-bit and 16-bit words that preserve the sign extension.
> However, in the above case, the compiler has
> decided to blow away the sign extension quite blatantly. Why? Because
> the declared return type, as well as the argument type, is unsigned. By
> my reading of section 6.5.3.3 of the C standard, in order to do negation
> on a narrow unsigned type, it must first get 'promoted' to int. But then,
> because the return type is unsigned, the result of the negation must be
> converted back to unsigned. And it is this conversion back to unsigned
> that gets compiled as a masking operation that blows away the sign
> extension.
Henry and/or Stephen can probably chime in with more info, but is GCC
really right here? Word8_neg is specifying an 8-bit negation -- a
perfectly well defined operation which, given 8 bits, returns the 8-bit
negation. While it may be the case that the implementation needs to step
through 32-bit operations, why should the end result look any different
from what would happen on an 8-bit machine?
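
To make the question concrete, here is a minimal C sketch of the two views
of an 8-bit negation. The names word8_neg_unsigned and sign_extend8 are
hypothetical stand-ins, not the actual MLton definitions, and the sketch
assumes int is wider than 8 bits. The low 8 bits of the result are the same
either way; only the upper bits of the 32-bit register differ between the
zero-extended (masked) and sign-extended forms.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for Word8_neg declared over unsigned words.
 * The operand is promoted to int, negated as an int, and the conversion
 * back to uint8_t reduces the value modulo 256 (the masking step). */
static uint8_t word8_neg_unsigned(uint8_t w) {
    return (uint8_t)(-w);
}

/* Hypothetical helper: reinterpret the low 8 bits as a signed value and
 * widen to 32 bits, i.e. the sign-extended register contents.
 * Written with xor/subtract to avoid an implementation-defined cast. */
static int32_t sign_extend8(uint8_t w) {
    return (int32_t)((w ^ 0x80) - 0x80);
}

/* Small self-check illustrating both views of the same 8-bit result. */
static void check_word8_neg(void) {
    /* -1 as an 8-bit word: 0xFF when zero-extended ... */
    assert(word8_neg_unsigned(1) == 0xFF);
    /* ... and -1 (0xFFFFFFFF) when sign-extended to 32 bits. */
    assert(sign_extend8(word8_neg_unsigned(1)) == -1);
    /* Negating 0 and 0x80 leaves the 8-bit pattern unchanged. */
    assert(word8_neg_unsigned(0) == 0);
    assert(word8_neg_unsigned(0x80) == 0x80);
}
```

If the caller expects a sign-extended register, something like sign_extend8
would have to be applied to the masked result before it is used as a signed
8-bit value.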