Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How does one remove duplicate elements in place in an array in O(n) in C or C++?

Tags:

c++

c

algorithm

Is there any method to remove the duplicate elements in an array in place in C/C++ in O(n)? Suppose elements are a[5]={1,2,2,3,4} then resulting array should contain {1,2,3,4} The solution can be achieved using two for loops but that would be O(n^2) I believe.

like image 772
pranay Avatar asked Aug 08 '10 01:08

pranay


2 Answers

If, and only if, the source array is sorted, this can be done in linear time:

std::unique(a, a + 5); //Returns a pointer to the new logical end of a.

Otherwise you'll have to sort first, which is (99.999% of the time) n lg n.

like image 182
Billy ONeal Avatar answered Oct 21 '22 15:10

Billy ONeal


Best case is O(n log n). Perform a heap sort on the original array: O(n log n) in time, O(1)/in-place in space. Then run through the array sequentially with 2 indices (source & dest) to collapse out repetitions. This has the side effect of not preserving the original order, but since "remove duplicates" doesn't specify which duplicates to remove (first? second? last?), I'm hoping that you don't care that the order is lost.

If you do want to preserve the original order, there's no way to do things in-place. But it's trivial if you make an array of pointers to elements in the original array, do all your work on the pointers, and use them to collapse the original array at the end.

Anyone claiming it can be done in O(n) time and in-place is simply wrong, modulo some arguments about what O(n) and in-place mean. One obvious pseudo-solution, if your elements are 32-bit integers, is to use a 4-gigabit bit-array (512 megabytes in size) initialized to all zeros, flipping a bit on when you see that number and skipping over it if the bit was already on. Of course then you're taking advantage of the fact that n is bounded by a constant, so technically everything is O(1) but with a horrible constant factor. However, I do mention this approach since, if n is bounded by a small constant - for instance if you have 16-bit integers - it's a very practical solution.

like image 21
R.. GitHub STOP HELPING ICE Avatar answered Oct 21 '22 14:10

R.. GitHub STOP HELPING ICE