Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Correctness of Fletcher32 checksum algorithm

I'm having a hard time figuring out which implementation of the 32-bit variation of the Fletcher checksum algorithm is correct. Wikipedia provides the following optimized implementation:

uint32_t fletcher32( uint16_t const *data, size_t words ) {
        uint32_t sum1 = 0xffff, sum2 = 0xffff;
        size_t tlen;

        while (words) {
                tlen = words >= 359 ? 359 : words;
                words -= tlen;
                do {
                        sum2 += sum1 += *data++;
                } while (--tlen);
                sum1 = (sum1 & 0xffff) + (sum1 >> 16);
                sum2 = (sum2 & 0xffff) + (sum2 >> 16);
        }
        /* Second reduction step to reduce sums to 16 bits */
        sum1 = (sum1 & 0xffff) + (sum1 >> 16);
        sum2 = (sum2 & 0xffff) + (sum2 >> 16);
        return sum2 << 16 | sum1;
}

In addition, I've adapted the non-optimized 16-bit example from the Wikipedia article to compute a 32-bit checksum:

uint32_t naive_fletcher32(uint16_t *data, int words) {
   uint32_t sum1 = 0;
   uint32_t sum2 = 0;

   int index;
   for( index = 0; index < words; ++index ) {
      sum1 = (sum1 + data[index]) % 0xffff;
      sum2 = (sum2 + sum1) % 0xffff;
   }
   return (sum2 << 16) | sum1;
}

Both these implementations yield the same results, e.g. 0x56502d2a for the string abcdef. To verify that this is indeed correct, I tried to find other implementations of the algorithm:

  • An online checksum/hash generator
  • C++ implementation in the srecord project
  • There's also a JavaScript implementation

All of these seem to agree that the checksum for abcdef is 0x8180255 instead of the value given by the implementation on Wikipedia. I've narrowed this down to how the data buffer the implementation operates on. All the above non-wikipedia implementation operate one byte at a time, whereas the Wikipedia implementation computes the checksum using 16-bit words. If I modify the above "naive" Wikipedia implementation to operate per-byte instead, it reads like this:

uint32_t naive_fletcher32_per_byte(uint8_t *data, int words) {
   uint32_t sum1 = 0;
   uint32_t sum2 = 0;

   int index;
   for( index = 0; index < words; ++index ) {
      sum1 = (sum1 + data[index]) % 0xffff;
      sum2 = (sum2 + sum1) % 0xffff;
   }
   return (sum2 << 16) | sum1;
}

The only thing that changes is the signature, really. So this modified naive implementation and the above mentioned implementations (except Wikipedia) agree that the checksum of abcdef is indeed 0x8180255.

My problem now is: which one is correct?

like image 745
fresskoma Avatar asked Oct 26 '16 19:10

fresskoma


People also ask

How is simple checksum calculated?

The received data unit is divided into segments of 8 bits. All the segments along with the checksum value are added. Sum of all segments + Checksum value = 00100101 + 11011010 = 11111111. Complemented value = 00000000.


2 Answers

According to the standard, the right method is the one that Wikipedia provides — except the name:

Note that the 8-bit Fletcher algorithm gives a 16-bit checksum and the 16-bit algorithm gives a 32-bit checksum.

like image 154
hidefromkgb Avatar answered Sep 24 '22 05:09

hidefromkgb


These are test vectors, which are cross checked with two different implementations for 16-bit and for 32-bit check sums:

8-bit implementation (16-bit checksum)
 "abcde" -> 51440 (0xC8F0)
 "abcdef" -> 8279 (0x2057)
 "abcdefgh" -> 1575 (0x0627)

16-bit implementation (32-bit checksum)
 "abcde" -> 4031760169 (0xF04FC729)
 "abcdef" -> 1448095018 (0x56502D2A)
 "abcdefgh" -> 3957429649 (0xEBE19591)
like image 45
Boris Krassi Avatar answered Sep 22 '22 05:09

Boris Krassi