I need a quick checksum (as fast as possilbe) for small strings (20-500 chars).
I need the source code and that must be small! (about 100 LOC max)
If it could generate strings in Base32/64. (or something similar) it would be perfect. Basically the checksums cannot use any "bad" chars.. you know.. the usual (){}[].,;:/+-\| etc
Clarifications
It could be strong/weak, that really doesn't matter since it is only for behind-the-scenes purposes.
It need not contain all the data of the original string since I will be only doing comparison with generated checksums, I don't expect any sort of "decryption".
For years MD5 was the fastest and most secure checksum available. Although xxHash is becoming more widely used there are still many companies that require the MD5 checksum for data integrity.
The checksum should be alphanumeric. The strings are unicode. The strings are actually texts that should be translated and the checksum is stored with each translation (so a translated text can be matched back to the original text). The length of the checksum is not important for me (the shorter, the better)
One algorithm, SHA-1, produces a 160-bit checksum and is the best-performing checksum, followed by the 256-bit and 512-bit versions. Checksums play an important role in data protection and file security.
schnaader's implementation is indeed very fast. Here it is in Javascript:
function checksum(s) { var chk = 0x12345678; var len = s.length; for (var i = 0; i < len; i++) { chk += (s.charCodeAt(i) * (i + 1)); } return (chk & 0xffffffff).toString(16); }
Using Google Chrome, this function takes just 5ms to run for 1-megabyte strings, versus 330ms using a crc32 function.
Quick implementation in C, no copyrights from my side, so use it as you wish. But please note that this is a very weak "checksum", so don't use it for serious things :) - but that's what you wanted, isn't it?
This returns an 32-bit integer checksum encoded as an string containing its hex value. If the checksum function doesn't satisfy your needs, you can change the chk += ((int)(str[i]) * (i + 1));
line to something better (f.e. multiplication, addition and bitwise rotating would be much better).
EDIT: Following hughdbrown's advice and one of the answers he linked, I changed the for
loop so it doesn't call strlen
with every iteration.
#include <stdio.h> #include <stdlib.h> #include <string> char* hextab = "0123456789ABCDEF"; char* encode_int(int i) { char* c = (char*)malloc(sizeof(char) * 9); for (int j = 0; j < 4; j++) { c[(j << 1)] = hextab[((i % 256) >> 4)]; c[(j << 1) + 1] = hextab[((i % 256) % 16)]; i = (i >> 8); } c[8] = 0; return c; } int checksum(char* str) { int i; int chk = 0x12345678; for (i = 0; str[i] != '\0'; i++) { chk += ((int)(str[i]) * (i + 1)); } return chk; } int main() { char* str1 = "Teststring"; char* str2 = "Teststring2"; printf("string: %s, checksum string: %s\n", str1, encode_int(checksum(str1))); printf("string: %s, checksum string: %s\n", str2, encode_int(checksum(str2))); return 0; }
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With