I have a simple test runner for the bug which is in my OpenPGP module https://github.com/singpolyma/OpenPGP-Haskell/blob/master/Data/OpenPGP.hs:
module Main where
import Data.OpenPGP
import Data.Binary (encode, decode)
packet = EmbeddedSignaturePacket (signaturePacket 2 168 ECDSA SHA256 [] [SignatureCreationTimePacket 1013401916,IssuerPacket "36FE856F4219F1C7"] 48065 [MPI 4,MPI 11,MPI 60,MPI 69,MPI 37,MPI 33,MPI 18,MPI 72,MPI 41,MPI 36,MPI 43,MPI 41,MPI 53,MPI 9,MPI 53,MPI 35,MPI 3,MPI 40,MPI 14,MPI 79,MPI 1,MPI 4,MPI 51,MPI 23,MPI 62,MPI 62,MPI 62,MPI 7,MPI 68,MPI 51,MPI 13,MPI 49,MPI 8,MPI 64,MPI 32,MPI 50,MPI 59,MPI 17,MPI 43,MPI 12,MPI 67,MPI 5,MPI 67,MPI 5,MPI 25,MPI 63,MPI 0,MPI 53,MPI 2,MPI 36,MPI 83,MPI 39,MPI 54,MPI 65,MPI 54,MPI 35,MPI 62,MPI 63,MPI 26,MPI 4,MPI 82,MPI 57,MPI 85,MPI 71,MPI 43,MPI 77])
main = print $ decode (encode packet) == packet
If you compile this (on ghc 7.4.1) with:
ghc -O0 -fforce-recomp --make t.hs
It works as expected (that is, it prints True
), but if you compile like this:
ghc -O1 -fforce-recomp --make t.hs
or this:
ghc -O2 -fforce-recomp --make t.hs
It will print False
.
I'm not using any extensions (except a trivial use of CPP) or low-level or unsafe calls, and the behaviour should be from my library and not a dependency, since it's only my code that is getting recompiled here.
The code optimization in the synthesis phase is a program transformation technique, which tries to improve the intermediate code by making it consume fewer resources (i.e. CPU, Memory) so that faster-running machine code will result.
We say that code optimization is writing or rewriting code so a program uses the least possible memory or disk space, minimizes its CPU time or network bandwidth, or makes the best use of additional cores. In practice, we sometimes default to another definition: Writing less code.
By optimizing an executable, you can achieve a balance between fast execution speed and small code size. This topic discusses some of the mechanisms that Visual Studio provides to help you optimize code.
It's an error in your code. Consider
MPI 63,MPI 0,MPI 53
^^^^^
and
instance BINARY_CLASS MPI where
put (MPI i) = do
put (((fromIntegral . B.length $ bytes) - 1) * 8
+ floor (logBase (2::Double) $ fromIntegral (bytes `B.index` 0))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+ 1 :: Word16)
putSomeByteString bytes
where
bytes = if B.null bytes' then B.singleton 0 else bytes'
bytes' = B.reverse $ B.unfoldr (\x ->
if x == 0 then Nothing else
Just (fromIntegral x, x `shiftR` 8)
) (assertProp (>=0) i)
Now, if we encode MPI 0
, bytes'
is empty, thus bytes = B.singleton 0
and hence bytes `B.index` 0
is 0.
But logBase 2 0
is -Infinity
, and floor
is only well-defined for finite values (within the range of the target type).
When compiling without optimisations, floor
uses the bit-pattern via decodeFloat
. Then floor (logBase 2 0)
yields 0 for all standard fixed-width integer types.
With optimisations, a rewrite-rule is active and floor
uses the primop double2Int#
, which returns whatever the hardware does, on x86 resp. x86-64, that's minBound :: Int
, as far as I know, regardless of the bit-pattern. The relevant code is
floorDoubleInt :: Double -> Int
floorDoubleInt (D# x) =
case double2Int# x of
n | x <## int2Double# n -> I# (n -# 1#)
| otherwise -> I# n
and of course, -Infinity < int2Double minBound
, so the value becomes minBound - 1
, which usually is maxBound
.
Of course that causes a wrong result, since now the "length" that is put
for MPI 0
becomes 0, and the 0 byte put after the "length" field is interpreted as part of the "length" of the next MPI
.
The problem is related to your BINARY_CLASS instance for MPI
. If I change
main = do
print packet
print (decode (encode packet) :: SignatureSubpacket)
print $ decode (encode packet) == packet
I see the output (compiled with -O2)
EmbeddedSignaturePacket (SignaturePacket {version = 2, signature_type = 168, key_algorithm = ECDSA, hash_algorithm = SHA256, hashed_subpackets = [], unhashed_subpackets = [SignatureCreationTimePacket 1013401916,IssuerPacket "36FE856F4219F1C7"], hash_head = 48065, signature = [MPI 4,MPI 11,MPI 60,MPI 69,MPI 37,MPI 33,MPI 18,MPI 72,MPI 41,MPI 36,MPI 43,MPI 41,MPI 53,MPI 9,MPI 53,MPI 35,MPI 3,MPI 40,MPI 14,MPI 79,MPI 1,MPI 4,MPI 51,MPI 23,MPI 62,MPI 62,MPI 62,MPI 7,MPI 68,MPI 51,MPI 13,MPI 49,MPI 8,MPI 64,MPI 32,MPI 50,MPI 59,MPI 17,MPI 43,MPI 12,MPI 67,MPI 5,MPI 67,MPI 5,MPI 25,MPI 63,MPI 0,MPI 53,MPI 2,MPI 36,MPI 83,MPI 39,MPI 54,MPI 65,MPI 54,MPI 35,MPI 62,MPI 63,MPI 26,MPI 4,MPI 82,MPI 57,MPI 85,MPI 71,MPI 43,MPI 77], trailer = Chunk "\168" (Chunk "<gI<" Empty)})
EmbeddedSignaturePacket (SignaturePacket {version = 2, signature_type = 168, key_algorithm = ECDSA, hash_algorithm = SHA256, hashed_subpackets = [], unhashed_subpackets = [SignatureCreationTimePacket 1013401916,IssuerPacket "36FE856F4219F1C7"], hash_head = 48065, signature = [MPI 4,MPI 11,MPI 60,MPI 69,MPI 37,MPI 33,MPI 18,MPI 72,MPI 41,MPI 36,MPI 43,MPI 41,MPI 53,MPI 9,MPI 53,MPI 35,MPI 3,MPI 40,MPI 14,MPI 79,MPI 1,MPI 4,MPI 51,MPI 23,MPI 62,MPI 62,MPI 62,MPI 7,MPI 68,MPI 51,MPI 13,MPI 49,MPI 8,MPI 64,MPI 32,MPI 50,MPI 59,MPI 17,MPI 43,MPI 12,MPI 67,MPI 5,MPI 67,MPI 5,MPI 25,MPI 63,MPI 0,MPI 0,MPI 339782829898145924110968965855122255180100961470274369007196973863828909184332476115285611703086303618816635857833592912611149], trailer = Chunk "\168" (Chunk "<gI<" Empty)})
Changing your MPI instance to this more straightforward implementation:
newtype MPI = MPI Integer deriving (Show, Read, Eq, Ord)
instance BINARY_CLASS MPI where
put (MPI i) = do
put (fromIntegral $ B.length bytes :: Word16)
putSomeByteString bytes
where
bytes = if B.null bytes' then B.singleton 0 else bytes'
bytes' = B.pack . map (read . (:[])) $ show i
get = do
length <- fmap fromIntegral (get :: Get Word16)
bytes <- getSomeByteString length
return (MPI $ read $ concatMap show $ B.unpack bytes)
fixes the problem.
There are a few things that could be the problem source. It's possible that your code is correct (I haven't checked this one way or the other), in which case GHC is performing some invalid transformation leading to an overflow/underflow somewhere. It's also possible that your code is doing something incorrect that is only exposed by certain optimizations.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With