Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Checksum for an integer array?

I have an array that is of size 4,9,16 or 25 (according to the input) and the numbers in the array are the same but less by one (if the array size is 9 then the biggest element in the array would be 8) the numbers start with 0 and I would like to do some algorithm to generate some sort of a checksum for the array so that I can compare that 2 arrays are equal without looping through the whole array and checking each element one by one.

Where can I get this sort of information? I need something that is as simple as possible. Thank you.

edit: just to be clear on what I want:

-All the numbers in the array are distinct, so [0,1,1,2] is not valid because there is a repeated element (1)

-The position of the numbers matter, so [0,1,2,3] is not the same as [3,2,1,0]

-The array will contain the number 0, so this should also be taken into consideration.

EDIT:

Okay I tried to implement the Fletcher's algorithm here: http://en.wikipedia.org/wiki/Fletcher%27s_checksum#Straightforward

int fletcher(int array[], int size){
  int i;
  int sum1=0;
  int sum2=0;
  for(i=0;i<size;i++){
    sum1=(sum1+array[i])%255;
    sum2=(sum2+sum1)%255;
  }
  return (sum2 << 8) | sum1;
}

to be honest I have no idea what does the return line do but unfortunately, the algorithm does not work. For arrays [2,1,3,0] and [1,3,2,0] I get the same checksum.

EDIT2:

okay here's another one, the Adler checksum http://en.wikipedia.org/wiki/Adler-32#Example_implementation

#define MOD 65521;

unsigned long adler(int array[], int size){
  int i;
  unsigned long a=1;
  unsigned long b=0;
  for(i=0;i<size;i++){
    a=(a+array[i])%MOD;
    b=(b+a)%MOD;
  }
  return (b <<16) | a;
}

This also does not work. Arrays [2,0,3,1] and [1,3,0,2] generate same checksum. I'm losing hope here, any ideas?

like image 1000
MinaHany Avatar asked Jun 05 '12 06:06

MinaHany


2 Answers

Let's take the case of your array of 25 integers. You explain that it can contains any permutations of the unique integers 0 to 24. According to this page, there is 25! (25 factorial) possible permutations, that is 15511210043330985984000000. Far more than a 32bit integer can contains.

The conclusion is that you will have collision, no matter how hard you try.

Now, here is a simple algorithm that account for position:

int checksum(int[] array, int size) {
  int c = 0;
  for(int i = 0; i < size; i++) {
    c += array[i];
    c = c << 3 | c >> (32 - 3); // rotate a little
    c ^= 0xFFFFFFFF; // invert just for fun
  }
  return c;
}
like image 54
Nicolas Repiquet Avatar answered Sep 18 '22 21:09

Nicolas Repiquet


I think what you want is in the answer of the following thread:

Fast permutation -> number -> permutation mapping algorithms

You just take the number your permutation is mapped to and take that as your Checksum. As there is exactly one Checksum per permutation there can't be a smaller Checksum that is collision free.

like image 42
Philosophus42 Avatar answered Sep 19 '22 21:09

Philosophus42