
show returning wrong value when used with unsafeCoerced value

I was experimenting with unsafeCoerce with Int8 and Word8, and I found some surprising behaviour (for me anyway).

Word8 is an unsigned 8-bit number that ranges from 0 to 255. Int8 is a signed 8-bit number that ranges from -128 to 127.

Since they are both 8-bit numbers, I assumed that coercing one to the other would be safe, and would simply reinterpret the same 8 bits as signed or unsigned.

For example, I would expect unsafeCoerce (-1 :: Int8) :: Word8 to result in a Word8 value of 255 (since the bit representation of -1 as a signed 8-bit integer is the same as that of 255 as an unsigned one).

However, when I actually perform the coercion, the resulting Word8 behaves strangely:

> GHCi, version 7.4.1: http://www.haskell.org/ghc/  :? for help
> import Data.Int
> import Data.Word
> import Unsafe.Coerce
> class ShowType a where typeName :: a -> String
> instance ShowType Int8 where typeName _ = "Int8"
> instance ShowType Word8 where typeName _ = "Word8"

> let x = unsafeCoerce (-1 :: Int8) :: Word8
> show x
"-1"
> typeName x
"Word8"
> show (x + 0)
"255"
> :t x
x :: Word8
> :t (x + 0)
(x + 0) :: Word8

I don't understand how show x is returning "-1" here. If you look at map show [minBound..maxBound :: Word8], no possible value for Word8 results in "-1". Also, how does adding 0 to the number change the behaviour, even if the type isn't changed? Strangely, it also appears it is only the Show class that is affected - my ShowType class returns the correct value.

Finally, the code fromIntegral (-1 :: Int8) :: Word8 works as expected: it returns 255 and works correctly with show. Is this code reduced to a no-op by the compiler, or could it be?

Note that this question is just out of curiosity about how types are represented in ghc at a low level. I'm not actually using unsafeCoerce in my code.

David Miani asked Apr 05 '13 08:04



2 Answers

As @kosmikus said, both Int8 and Int16 are implemented using an Int#, which is 32 bits wide on 32-bit architectures (and Word8 and Word16 are Word# under the hood). A comment in GHC.Prim explains this in more detail.

So let's find out why this implementation choice results in the behaviour you see:

> let x = unsafeCoerce (-1 :: Int8) :: Word8
> show x
"-1"

The Show instance for Word8 is defined as

instance Show Word8 where
    showsPrec p x = showsPrec p (fromIntegral x :: Int)

and fromIntegral is just fromInteger . toInteger. The definition of toInteger for Word8 is

toInteger (W8# x#)            = smallInteger (word2Int# x#)

where smallInteger (defined in integer-gmp) is

smallInteger :: Int# -> Integer
smallInteger i = S# i

and word2Int# is a primop with type Word# -> Int# - an analog of reinterpret_cast<int> in C++. So that explains why you see -1 in the first example: the value is just reinterpreted as a signed integer and printed out.
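To see the reinterpretation without touching primops, here is a minimal sketch in portable Haskell. It assumes that fromIntegral between Word and Int of the same width keeps the bit pattern, which is how it stands in for word2Int# here; the names reinterpretAsInt and allOnes are made up for illustration and are not part of GHC.

```haskell
import Data.Word (Word)

-- Hypothetical stand-in for the word2Int# primop: converting a Word to an
-- Int of the same width keeps the bits and changes only how they are read.
reinterpretAsInt :: Word -> Int
reinterpretAsInt = fromIntegral

-- A machine word with every bit set. This models the payload of
-- (-1 :: Int8) when it is stored sign-extended in a full-width word.
allOnes :: Word
allOnes = maxBound

-- What a show that goes through the signed reinterpretation would print.
shownViaInt :: Word -> String
shownViaInt = show . reinterpretAsInt
```

Evaluating shownViaInt allOnes in GHCi gives "-1", which mirrors the surprising show x result, while a payload that is already in range, such as 255, still prints as "255".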

Now, why would adding 0 to x give you 255? Looking at the Num instance for Word8 we see this:

(W8# x#) + (W8# y#)    = W8# (narrow8Word# (x# `plusWord#` y#))

So it looks like the narrow8Word# primop is the culprit. Let's check:

> import GHC.Word
> import GHC.Prim
> case x of (W8# w) -> (W8# (narrow8Word# w))
255

Indeed it is. That explains why adding 0 is not a no-op: Word8 addition narrows the result to its low 8 bits, which brings the value back into the intended range.
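The narrowing step can be modelled in ordinary Haskell as a bit mask. This is only a sketch of the idea, assuming the Word8 payload lives in a full-width Word as described above; narrow8 and addModel are hypothetical names, not the actual GHC.Word definitions.

```haskell
import Data.Bits ((.&.))
import Data.Word (Word)

-- Hypothetical model of narrow8Word#: keep only the low 8 bits.
narrow8 :: Word -> Word
narrow8 w = w .&. 0xFF

-- Hypothetical model of Word8 addition on full-width payloads:
-- add the words, then narrow the result back to 8 bits.
addModel :: Word -> Word -> Word
addModel x y = narrow8 (x + y)
```

With this model, addModel maxBound 0 evaluates to 255 even though the first argument has every bit set, which matches show (x + 0) giving "255". An ordinary overflow behaves the same way: addModel 200 100 is 44, just like 200 + 100 :: Word8.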

Mikhail Glushenkov answered Nov 10 '22 18:11



You can't say something is wrong when you've used unsafeCoerce. Anything can happen if you use that function. The compiler probably stores an Int8 in a word, and using unsafeCoerce to Word8 breaks the invariants on what is stored in this word. Use fromIntegral to convert.

When compiled with GHC on x86, conversion from Int8 to Word8 using fromIntegral turns into a movzbl instruction (a zero-extending byte move), which is basically a no-op.
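For completeness, a small sketch of the fromIntegral route in both directions. The helper names int8ToWord8 and word8ToInt8 are made up for this example; the only library function involved is fromIntegral from the Prelude.

```haskell
import Data.Int (Int8)
import Data.Word (Word8)

-- Hypothetical helper: convert a signed byte to an unsigned byte,
-- wrapping modulo 256.
int8ToWord8 :: Int8 -> Word8
int8ToWord8 = fromIntegral

-- Hypothetical helper: the reverse conversion, also wrapping modulo 256.
word8ToInt8 :: Word8 -> Int8
word8ToInt8 = fromIntegral
```

Here int8ToWord8 (-1) is 255 and shows as "255", and word8ToInt8 255 is -1, so a round trip through both helpers returns the original value for every Int8.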

augustss answered Nov 10 '22 18:11
