Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Fast open source checksum for small strings

Tags:

I need a quick checksum (as fast as possilbe) for small strings (20-500 chars).

I need the source code and that must be small! (about 100 LOC max)

If it could generate strings in Base32/64. (or something similar) it would be perfect. Basically the checksums cannot use any "bad" chars.. you know.. the usual (){}[].,;:/+-\| etc

Clarifications

It could be strong/weak, that really doesn't matter since it is only for behind-the-scenes purposes.

It need not contain all the data of the original string since I will be only doing comparison with generated checksums, I don't expect any sort of "decryption".

like image 367
Robin Rodricks Avatar asked May 01 '09 12:05

Robin Rodricks


People also ask

What is the fastest checksum?

For years MD5 was the fastest and most secure checksum available. Although xxHash is becoming more widely used there are still many companies that require the MD5 checksum for data integrity.

How do you write a checksum for a string?

The checksum should be alphanumeric. The strings are unicode. The strings are actually texts that should be translated and the checksum is stored with each translation (so a translated text can be matched back to the original text). The length of the checksum is not important for me (the shorter, the better)

What is a good checksum?

One algorithm, SHA-1, produces a 160-bit checksum and is the best-performing checksum, followed by the 256-bit and 512-bit versions. Checksums play an important role in data protection and file security.


2 Answers

schnaader's implementation is indeed very fast. Here it is in Javascript:

function checksum(s) {   var chk = 0x12345678;   var len = s.length;   for (var i = 0; i < len; i++) {       chk += (s.charCodeAt(i) * (i + 1));   }    return (chk & 0xffffffff).toString(16); } 

Using Google Chrome, this function takes just 5ms to run for 1-megabyte strings, versus 330ms using a crc32 function.

like image 86
joelpt Avatar answered Nov 14 '22 07:11

joelpt


Quick implementation in C, no copyrights from my side, so use it as you wish. But please note that this is a very weak "checksum", so don't use it for serious things :) - but that's what you wanted, isn't it?

This returns an 32-bit integer checksum encoded as an string containing its hex value. If the checksum function doesn't satisfy your needs, you can change the chk += ((int)(str[i]) * (i + 1)); line to something better (f.e. multiplication, addition and bitwise rotating would be much better).

EDIT: Following hughdbrown's advice and one of the answers he linked, I changed the for loop so it doesn't call strlen with every iteration.

#include <stdio.h> #include <stdlib.h> #include <string>  char* hextab = "0123456789ABCDEF";  char* encode_int(int i) {   char* c = (char*)malloc(sizeof(char) * 9);    for (int j = 0; j < 4; j++) {     c[(j << 1)] = hextab[((i % 256) >> 4)];     c[(j << 1) + 1] = hextab[((i % 256) % 16)];      i = (i >> 8);   }   c[8] = 0;    return c; }  int checksum(char* str) {   int i;   int chk = 0x12345678;    for (i = 0; str[i] != '\0'; i++) {     chk += ((int)(str[i]) * (i + 1));   }    return chk; }  int main() {   char* str1 = "Teststring";   char* str2 = "Teststring2";    printf("string: %s, checksum string: %s\n", str1, encode_int(checksum(str1)));   printf("string: %s, checksum string: %s\n", str2, encode_int(checksum(str2)));    return 0; } 
like image 25
schnaader Avatar answered Nov 14 '22 09:11

schnaader