Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

CRC32 calculation in Python without using libraries

I have been trying to get my head around CRC32 calculations without much success, the values that I seem to get do not match what I should get.

I am aware that Python has libraries that are capable of generating these checksums (namely zlib and binascii) but I do not have the luxury of being able to use them as the CRC functionality do not exist on the micropython.

So far I have the following code:

import binascii
import zlib
from array import array

poly = 0xEDB88320

table = array('L')
for byte in range(256):
    crc = 0
    for bit in range(8):
        if (byte ^ crc) & 1:
            crc = (crc >> 1) ^ poly
        else:
            crc >>= 1
        byte >>= 1
    table.append(crc)

def crc32(string):
    value = 0xffffffffL

    for ch in string:
        value = table[(ord(ch) ^ value) & 0x000000ffL] ^ (value >> 8)

    return value

teststring = "test"

print "binascii calc:  0x%08x" % (binascii.crc32(teststring) & 0xffffffff)
print "zlib calc:      0x%08x" % (zlib.crc32(teststring) & 0xffffffff)
print "my calc:        0x%08x" % (crc32(teststring))

Then I get the following output:

binascii calc:  0xd87f7e0c
zlib calc:      0xd87f7e0c
my calc:        0x2780810c

The binascii and zlib calculations agree where as my one doesn't. I believe the calculated table of bytes is correct as I have compared it to examples available on the net. So the issue must be the routine where each byte is calculated, could anyone point me in the correct direction?

Thanks in advance!

like image 990
Cooper Avatar asked Jan 10 '17 09:01

Cooper


People also ask

How does Python calculate CRC checksum?

crc32() method, we can compute the checksum for crc32 (Cyclic Redundancy Check) to a particular data. It will give 32-bit integer value as a result by using zlib. crc32() method. Return : Return the unsigned 32-bit checksum integer.

What is crc32 Python?

crc32 computes CRC-32 on binary text data. By definition, CRC-32 refers to the 32-bit checksum of any piece of data. This method is used to compute 32-bit checksum of provided data. This algorithm is not a general hash algorithm. It should only be used as a checksum algorithm.

What is crc32 in zlib?

zlib. crc32 (data[, value]) Computes a CRC (Cyclic Redundancy Check) checksum of data. The result is an unsigned 32-bit integer. If value is present, it is used as the starting value of the checksum; otherwise, a default value of 0 is used.


1 Answers

I haven't looked closely at your code, so I can't pinpoint the exact source of the error, but you can easily tweak it to get the desired output:

import binascii
from array import array

poly = 0xEDB88320

table = array('L')
for byte in range(256):
    crc = 0
    for bit in range(8):
        if (byte ^ crc) & 1:
            crc = (crc >> 1) ^ poly
        else:
            crc >>= 1
        byte >>= 1
    table.append(crc)

def crc32(string):
    value = 0xffffffffL
    for ch in string:
        value = table[(ord(ch) ^ value) & 0xff] ^ (value >> 8)

    return -1 - value

# test

data = (
    '',
    'test',
    'hello world',
    '1234',
    'A long string to test CRC32 functions',
)

for s in data:
    print repr(s)
    a = binascii.crc32(s)
    print '%08x' % (a & 0xffffffffL)
    b = crc32(s)
    print '%08x' % (b & 0xffffffffL)
    print

output

''
00000000
00000000

'test'
d87f7e0c
d87f7e0c

'hello world'
0d4a1185
0d4a1185

'1234'
9be3e0a3
9be3e0a3

'A long string to test CRC32 functions'
d2d10e28
d2d10e28

Here are a couple more tests that verify that the tweaked crc32 gives the same result as binascii.crc32.

from random import seed, randrange

print 'Single byte tests...',
for i in range(256):
        s = chr(i)
        a = binascii.crc32(s) & 0xffffffffL
        b = crc32(s) & 0xffffffffL
        assert a == b, (repr(s), a, b)

print('ok')

seed(42)

print 'Multi-byte tests...'
for width in range(2, 20):
    print 'Width', width
    r = range(width)
    for n in range(1000):
        s = ''.join([chr(randrange(256)) for i in r])
        a = binascii.crc32(s) & 0xffffffffL
        b = crc32(s) & 0xffffffffL
        assert a == b, (repr(s), a, b)
print('ok')

output

Single byte tests... ok
Multi-byte tests...
Width 2
Width 3
Width 4
Width 5
Width 6
Width 7
Width 8
Width 9
Width 10
Width 11
Width 12
Width 13
Width 14
Width 15
Width 16
Width 17
Width 18
Width 19
ok

As discussed in the comments, the source of the error in the original code is that this CRC-32 algorithm inverts the initial crc buffer, and then inverts the final buffer contents. So value is initialised to 0xffffffff instead of zero, and we need to return value ^ 0xffffffff, which can also be written as ~value & 0xffffffff, i.e. invert value and then select the low-order 32 bits of the result.

like image 59
PM 2Ring Avatar answered Oct 28 '22 04:10

PM 2Ring