Question

I was experimenting with unsafeCoerce on Int8 and Word8, and I found some behaviour that surprised me.

Word8 is an unsigned 8-bit number that ranges from 0 to 255. Int8 is a signed 8-bit number that ranges from -128 to 127.

Since they are both 8-bit numbers, I assumed that coercing one to the other would be safe, and would just return the same 8 bits interpreted as signed or unsigned.

For example, I would expect unsafeCoerce (-1 :: Int8) :: Word8 to result in a Word8 value of 255 (since the bit representation of -1 in a signed int is the same as 255 in an unsigned int).

However, when I actually perform the coercion, the resulting Word8 behaves strangely:

> GHCi, version 7.4.1: http://www.haskell.org/ghc/  :? for help
> import Data.Int
> import Data.Word
> import Unsafe.Coerce
> class ShowType a where typeName :: a -> String
> instance ShowType Int8 where typeName _ = "Int8"
> instance ShowType Word8 where typeName _ = "Word8"

> let x = unsafeCoerce (-1 :: Int8) :: Word8
> show x
"-1"
> typeName x
"Word8"
> show (x + 0)
"255"
> :t x
x :: Word8
> :t (x + 0)
(x + 0) :: Word8

I don't understand how show x is returning "-1" here. If you look at map show [minBound..maxBound :: Word8], no possible value for Word8 results in "-1". Also, how does adding 0 to the number change the behaviour, even if the type isn't changed? Strangely, it also appears it is only the Show class that is affected - my ShowType class returns the correct value.

Finally, the code fromIntegral (-1 :: Int8) :: Word8 works as expected, and returns 255, and works correctly with show. Is/can this code be reduced to a no-op by the compiler?

Note that this question is just out of curiosity about how types are represented in ghc at a low level. I'm not actually using unsafeCoerce in my code.

Solution

As @kosmikus said, both Int8 and Int16 are implemented using an Int#, which is 32 bits wide on 32-bit architectures (and Word8 and Word16 are Word# under the hood). This comment in GHC.Prim explains it in more detail.

So let's find out why this implementation choice results in the behaviour you see:

> let x = unsafeCoerce (-1 :: Int8) :: Word8
> show x
"-1"

The Show instance for Word8 is defined as

instance Show Word8 where
    showsPrec p x = showsPrec p (fromIntegral x :: Int)

and fromIntegral is just fromInteger . toInteger. The definition of toInteger for Word8 is

toInteger (W8# x#)            = smallInteger (word2Int# x#)

where smallInteger (defined in integer-gmp) is

smallInteger :: Int# -> Integer
smallInteger i = S# i

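and word2Int# is a primop with type Word# -> Int#, an analog of reinterpret_cast<int> in C++. So that explains why you see -1 in the first example: the coerced Word8 still carries the untruncated, all-bits-set pattern of -1 in its underlying Word#, and that pattern is simply reinterpreted as a signed integer and printed out.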
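The same reinterpretation can be observed without unsafeCoerce. The following is only a minimal sketch: asSigned is a hypothetical helper (not something from GHC's sources), and it relies on fromIntegral between Word and Int keeping the bit pattern and changing only how it is read, which mirrors what word2Int# does.

```haskell
-- Hypothetical helper (illustration only): view a machine Word as an Int.
-- fromIntegral between Word and Int wraps around, so the bits are kept
-- and only their signed/unsigned interpretation changes.
asSigned :: Word -> Int
asSigned = fromIntegral

main :: IO ()
main = do
  print (asSigned maxBound)  -- all bits set, read as signed: -1
  print (asSigned 255)       -- small values are unchanged: 255
```

A Word with every bit set prints as -1 once it is viewed as an Int, which is the same thing show does to the coerced Word8.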

Now, why would adding 0 to x give you 255? Looking at the Num instance for Word8 we see this:

(W8# x#) + (W8# y#)    = W8# (narrow8Word# (x# `plusWord#` y#))

So it looks like the narrow8Word# primop is the culprit. Let's check:

> import GHC.Word
> import GHC.Prim
> case x of (W8# w) -> (W8# (narrow8Word# w))
255

Indeed it is. That explains why adding 0 is not a no-op: Word8 addition truncates the result to its low 8 bits, which brings the value back into the intended range.
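To show the narrowing step in isolation, here is a small sketch that applies narrow8Word# to an ordinary Word instead of a coerced Word8. The helper name narrowLow8 is hypothetical. As far as I know, more recent GHC releases give Word8 a different underlying primitive type than the Word# shown above, so the W8# session may not typecheck there as written; going through W# avoids depending on that detail.

```haskell
{-# LANGUAGE MagicHash #-}
import GHC.Exts (Word (W#), narrow8Word#)

-- Hypothetical helper (illustration only): keep the low 8 bits of a
-- machine word by unwrapping it, applying the primop, and rewrapping.
narrowLow8 :: Word -> Word
narrowLow8 (W# w) = W# (narrow8Word# w)

main :: IO ()
main = do
  print (narrowLow8 maxBound)  -- all bits set -> 255
  print (narrowLow8 256)       -- bit 8 is dropped -> 0
  print (narrowLow8 300)       -- 300 mod 256 -> 44
```

The last line shows that the operation wraps rather than saturates: 300 becomes 44, not 255.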

Other tips

You can't say something is wrong when you've used unsafeCoerce. Anything can happen if you use that function. The compiler probably stores an Int8 in a word, and using unsafeCoerce to Word8 breaks the invariants on what is stored in this word. Use fromIntegral to convert.

Conversion from Int8 to Word8 using fromIntegral turns into a movzbl (zero-extending byte move) instruction with ghc on x86, which is basically a no-op.
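For reference, a minimal sketch of the fromIntegral route in both directions. The names toUnsigned and toSigned are hypothetical wrappers added only for illustration.

```haskell
import Data.Int (Int8)
import Data.Word (Word8)

-- Hypothetical wrappers around fromIntegral. Each conversion keeps the
-- 8-bit pattern and changes only the signed/unsigned interpretation.
toUnsigned :: Int8 -> Word8
toUnsigned = fromIntegral

toSigned :: Word8 -> Int8
toSigned = fromIntegral

main :: IO ()
main = do
  print (toUnsigned (-1))      -- 255
  print (toUnsigned minBound)  -- 128
  print (toSigned 255)         -- -1
  -- Round-tripping through Word8 gives back the original Int8 values.
  print (map (toSigned . toUnsigned) [minBound, -1, 0, maxBound])
```

Unlike the unsafeCoerce version, the results here show and compare correctly, because each value is properly narrowed for its type.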

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow