I'm using the following function to compute the CRC32 of a file in a VS2008, .NET 3.5 project:
public UInt32 ComputeHash(System.IO.Stream stream)
{
unchecked
{
const int BUFFER_SIZE = 1024;
UInt32 crc32Result = 0xFFFFFFFF;
byte[] buffer = new byte[BUFFER_SIZE];
int count = stream.Read(buffer, 0, BUFFER_SIZE);
while (count > 0)
{
for (int i = 0; i < count; i++)
{
crc32Result = ((crc32Result) >> 8) ^ _crc32Table[(buffer[i]) ^ (crc32Result) & _LOOKUP_TABLE_MAX_INDEX];
}
count = stream.Read(buffer, 0, BUFFER_SIZE);
}
return ~crc32Result;
}
}
For the sake of brevity, I have left out the function that builds the lookup table (_crc32Table). The table is an array of UInt32, is built when the class is instantiated, and contains 256 values (256 is also the value of _LOOKUP_TABLE_MAX_INDEX + 1).
I have run some benchmarks comparing this to the MD5CryptoServiceProvider and SHA1CryptoServiceProvider ComputeHash functions and they are much faster. The MD5 function is over twice as fast and the SHA1 hash is about 35% faster. I was told CRC32 is fast, but that's not what I'm seeing.
Am I mistaken in my assumptions? Is this to be expected or is there a flaw in this algorithm?
CRC32 is an error-detecting function that uses a CRC32 algorithm to detect changes between source and target data. The CRC32 function converts a variable-length string into an 8-character string that is a text representation of the hexadecimal value of a 32 bit-binary sequence.
There is a way to get the CRC-32 on Windows (since Win 7): Right-click the file(s) you wish to get the CRC-32 for and click Send to → Compressed (zipped) folder. Open the ZIP file using Windows Explorer, set the view to details. Right-click on the detail header and select the CRC-32 column to be visible.
CRC-32 is not a cryptographic hash and is typically used as a fast error-detecting code for detecting changes in raw data. This algorithm is known as CRC-32/BZIP2 .
You are comparing your code to built in functions and asking why they are faster. What you need to do is find the source for the built in functions. How do they work? See what's different.
Betcha the built in functions call out to a native library and cheat by not having to run inside the managed memory framework.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With