Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to find first non-repeating element?

How to find first non-repeating element in an array. Provided that you can only use 1 bit for every element of the array and time complexity should be O(n) where n is length of array. Please make sure that I somehow imposed constraint on memory requirements. It is also possible that it can not be done with just an extra bit per element of the string. Also please let me know if it is possible or not?

like image 735
arvind.mohan Avatar asked Aug 15 '11 09:08

arvind.mohan


1 Answers

I would say there is no comparison based algorithm, that can do it in O(n). As you have to compare the the first element of the array with all others, the 2nd with all except the first, the 3rd with all except the first = Sum i = O(n^2).

(But that does not necessarily mean that there is no faster algorithm, see sorting: There is a proof that you cant sort fast than O(n log n) if you are comparison based - and there is indeed one faster: Bucket Sort, which can do it in O(n)).

EDIT: In one of the other comments I said something about hash functions. I checked some facts about it, and here are the hashmap approach thoughts:

  • Obvious approach is (in Pseudocode):

    for (i = 0; i < maxsize; i++)
        count[i] = 0;
    for (i = 0; i < maxsize; i++) {
       h = hash(A[i]);
       count[h]++;
    }
    first = -1;
    for (i = 0; i < maxsize; i++)
       if (count[i] == 0) {
          first = i;
          break;
       }
    }
    for (i = 0; hash(A[i]) != first; i++) ;
    printf("first unique: " + A[i]); 
    
  • There are some caveats:

    1. How to get hash. I did some research on perfect hash functions. And indeed you can generate one in O(n). (Optimal algorithms for minimal perfect hashing by George Havas et al. - Not sure how good this paper is, as it claims as Time Limit O(n) but speaks from non linear space limit (which is plan an error, I hope I am not the only seeing the flaw in the this, but according to all theorical computer science I know off time is an upper border for space (as you dont have time to write in more space)). But I believe them when they say it is possible in O(n).

    2. The additional space - here I dont see a solution. Above papers cites some research that says that you need 2.7 bits for the perfect hash function. With the additional count array (which you can shorten to the states: Empty + 1 Element + More than 1 Element) you need 2 additional bits per element (1.58 if you assume you can it somehow combine with the above 2.7), which sums up to additional 5 bits.

like image 189
flolo Avatar answered Oct 12 '22 00:10

flolo