I got this question during one of my technical interviews. I know a way to solve this problem (in Java) using a HashSet; see the sketch below.
But I could not understand why the interviewer stressed the phrase "a very large array, let's say 10 million elements in the given array".
Do I need to change the approach? If not, what would be an efficient way to achieve this?
PS: The algorithm or implementation is language agnostic.
Thank you.
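For reference, this is the kind of HashSet solution I mean (a minimal sketch; the class and method names are just illustrative):

```java
import java.util.HashSet;
import java.util.Set;

public class HashSetDuplicates {
    // Returns every value that occurs more than once in the array.
    static Set<Integer> findDuplicates(int[] array) {
        Set<Integer> seen = new HashSet<>();
        Set<Integer> duplicates = new HashSet<>();
        for (int value : array) {
            // Set.add returns false when the value is already present.
            if (!seen.add(value)) {
                duplicates.add(value);
            }
        }
        return duplicates;
    }
}
```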
There were some key things the interviewer expected you to ask back, such as: if you cannot load the whole array into memory, then how much can you load? These are the steps to solve the problem:

1. Divide the array into `k` parts. Load the first 1M elements and build a `Min Heap` of them. Then remove the top and apply Heapify on the `Min Heap`.
2. Repeat the same for the other parts of the array, so that each of the `k` parts is available in sorted order.
3. Build a `Min Heap` of size `k` from the smallest (front) element of each part.
4. Remove the top of this `Min Heap`; it is the smallest number overall. Store the value in a temporary variable as well, for comparing with the next coming number to find the duplicates.
5. Replace the removed element with the next element from the same part and apply Heapify on the `Min Heap`.
6. The top of the `Min Heap` is now your next sorted number; compare it with the temporary variable to find the duplicates, and update the temporary variable if the number is not a duplicate.
7. Repeat steps 4-6 until all parts are exhausted.

You can do it in O(n log(n)).
I think that is what the interviewer wanted to hear.
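Here is a minimal Java sketch of steps 3-7. It is only illustrative: the parts are kept as in-memory arrays so the example is self-contained, whereas for a real 10-million-element input each sorted part would be streamed from disk; the `Entry` record (Java 16+) and all names are my own. Note that a value occurring n times is reported n-1 times.

```java
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

public class ExternalDuplicateFinder {

    // One heap entry: the current value plus the part and position it came from.
    private record Entry(int value, int part, int index) {}

    // k-way merge over k individually sorted parts, detecting duplicates on the fly.
    static void reportDuplicates(List<int[]> sortedParts) {
        PriorityQueue<Entry> heap = new PriorityQueue<>(Comparator.comparingInt(Entry::value));
        // Seed the heap with the smallest element of each part (step 3).
        for (int p = 0; p < sortedParts.size(); p++) {
            if (sortedParts.get(p).length > 0) {
                heap.add(new Entry(sortedParts.get(p)[0], p, 0));
            }
        }
        Integer previous = null; // the "temporary variable" from the steps above
        while (!heap.isEmpty()) {
            Entry top = heap.poll(); // next number in global sorted order (steps 4 and 6)
            if (previous != null && previous == top.value()) {
                System.out.println("duplicate: " + top.value());
            } else {
                previous = top.value(); // update only when the number is not a duplicate
            }
            // Refill from the same part the removed number came from (step 5).
            int[] part = sortedParts.get(top.part());
            if (top.index() + 1 < part.length) {
                heap.add(new Entry(part[top.index() + 1], top.part(), top.index() + 1));
            }
        }
    }

    public static void main(String[] args) {
        // Each part would first be sorted on its own (steps 1 and 2), e.g. via heap sort.
        reportDuplicates(List.of(new int[]{1, 4, 7}, new int[]{2, 4, 9}, new int[]{4, 5}));
    }
}
```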
If you did a merge sort or a quicksort, finding the duplicates could be done during the merging, in hidden time. These can be implemented "in-place", or "by-part" if the array is too large to fit in memory.
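For instance, a merge step instrumented to detect duplicates might look like the following sketch (my own illustration; equal elements meeting at the merge frontier are reported while the two sorted halves are combined):

```java
import java.util.Arrays;

public class MergeDuplicates {

    // Merges two sorted arrays into one; equal elements that meet at the
    // merge frontier are reported as duplicates at no extra asymptotic cost.
    static int[] mergeAndReport(int[] left, int[] right) {
        int[] out = new int[left.length + right.length];
        int i = 0, j = 0, k = 0;
        while (i < left.length && j < right.length) {
            if (left[i] == right[j]) {
                System.out.println("duplicate: " + left[i]);
            }
            out[k++] = (left[i] <= right[j]) ? left[i++] : right[j++];
        }
        // Copy whichever side still has elements remaining.
        while (i < left.length) out[k++] = left[i++];
        while (j < right.length) out[k++] = right[j++];
        return out;
    }

    public static void main(String[] args) {
        int[] merged = mergeAndReport(new int[]{1, 3, 5}, new int[]{3, 4, 5});
        System.out.println(Arrays.toString(merged)); // [1, 3, 3, 4, 5, 5]
    }
}
```

Duplicates within one half are caught the same way at the deeper recursion levels of the merge sort, so nothing beyond the merges themselves is needed.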