Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

3rd way to find a duplicate

Two common ways to detect duplicates in an array:

1) sort first, time complexity O (n log n), space complexity O (1)

2) hash set, time complexity O (n), space complexity O (n)

Is there a 3rd way to detect a duplicate?

Please do not answer brute force.

like image 541
Dante May Code Avatar asked May 10 '11 09:05

Dante May Code


2 Answers

Yet another option is the Bloom filter. Complexity O(n), space varies (almost always less than a hash), but there is a possibility of false positives. There is a complex relationship between the size of the data set, the size of the filter, and the number of false positives that you can expect.

Bloom filters are often used as quick "sanity checks" before doing a more expensive duplicate check.

like image 163
btilly Avatar answered Nov 15 '22 06:11

btilly


Depends on the information.

If you know the range of the numbers, like 1-1000, you can use a bit array.

Lets say the range is a...b

Make a bit array with (b-a) bits. initialize them to 0.

Loop through the array, when you get to the number x , change the bit in place x-a to 1.

If there's a 1 there already, you have a duplicate.

time complexity: O(n)

space complexity: (b-a) bits

like image 29
Yochai Timmer Avatar answered Nov 15 '22 07:11

Yochai Timmer