Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Finding duplicate elements with limited memory

The following is a question from Cracking the coding interview:

You have an array with all numbers from 1 to N, where N is at most 32,000. The array may have duplicate entries and you do not know what N is. With only 4KB of memory available, how would you print all duplicate elements in the array?

The method signature is

public static void checkDuplicates(int[] array)

Then the solution explains how you can use bit vector to solve this by representing each integer as a bit. My confusion is when we run this method, won't it load the entire array in memory to loop through it? Now if the array is of size say,for example, 1 billion (many repeated elements) won't this program fail since it loads the entire array in memory and the memory we have is 32 * 2^10 bits?

like image 911
Vivin Avatar asked Jan 05 '23 16:01

Vivin


2 Answers

Below is a tested code:

public void checkDuplicates(int[] nums){
    int bytesNeeded = (nums.length/8) + 1;
    byte[] bitSet = new byte[bytesNeeded];

    for(int i=0; i<nums.length; i++){
        int n = nums[i];
        int byteIndex = n / 8;
        int indexInByte = n % 8;

        byte bit = (byte)(bitSet[byteIndex] & (1 << indexInByte));
        if(bit > 0){
            System.out.print(nums[i] + " ");
        }else{
            bitSet[byteIndex] |= 1 << indexInByte; 
        }
    }
}
like image 102
Erika Dsouza Avatar answered Jan 15 '23 10:01

Erika Dsouza


This could be a tricky question. I recently interviewed at Google and they had some sort of questions like yours. I think the best to do in those cases to explain your line of thought and cover each cases. Those questions are constructed by humans too, so it is possible that they missed out a word etc. If I had to answer this question I would come up with multiple answers:

  • All memory usage could be 4KB (Problems etc)
  • Your solutions should fit into 4KB (The mentioned solution)

The text says that:

With only 4KB of memory available [...]

Because Java is an interesting language in terms of passing values, you do not create a new instance of the int array when it is passed to the method.

public class Test {
    public static void main(String[] args) {
        int[] stuff = {1};
        System.out.println("before: " + stuff[0]);
        doStuff(stuff);
        System.out.println("after: " + stuff[0]);
    }
    public static void doStuff(int[] array){
        array[0]=10;
    }
}

Because of this behaviour your 4KB is available for your inner processing algorithm. I think that this limitation is only to prevent the "I make a copy of it and..." kind of solutions.

like image 26
Hash Avatar answered Jan 15 '23 09:01

Hash