Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Efficient way to find missing elements in an integer sequence

Tags:

Suppose we have two items missing in a sequence of consecutive integers and the missing elements lie between the first and last elements. I did write a code that does accomplish the task. However, I wanted to make it efficient using less loops if possible. Any help will be appreciated. Also what about the condition when we have to find more missing items (say close to n/4) instead of 2. I think then my code should be efficient right because I am breaking out from the loop earlier?

def missing_elements(L,start,end,missing_num):     complete_list = range(start,end+1)     count = 0     input_index = 0     for item  in  complete_list:         if item != L[input_index]:             print item             count += 1         else :             input_index += 1         if count > missing_num:             break    def main():     L = [10,11,13,14,15,16,17,18,20]     start = 10     end = 20     missing_elements(L,start,end,2)    if __name__ == "__main__":     main() 
like image 296
vkaul11 Avatar asked Jun 06 '13 23:06

vkaul11


People also ask

How do you find multiple missing numbers in an integer array with duplicates?

To find the multiple missing elements run a loop inside it and see if the diff is less than arr[i] – i then print the missing element i.e., i + diff. Now increment the diff as the difference is increased now. Repeat from step 2 until all the missing numbers are not found.


2 Answers

If the input sequence is sorted, you could use sets here. Take the start and end values from the input list:

def missing_elements(L):     start, end = L[0], L[-1]     return sorted(set(range(start, end + 1)).difference(L)) 

This assumes Python 3; for Python 2, use xrange() to avoid building a list first.

The sorted() call is optional; without it a set() is returned of the missing values, with it you get a sorted list.

Demo:

>>> L = [10,11,13,14,15,16,17,18,20] >>> missing_elements(L) [12, 19] 

Another approach is by detecting gaps between subsequent numbers; using an older itertools library sliding window recipe:

from itertools import islice, chain  def window(seq, n=2):     "Returns a sliding window (of width n) over data from the iterable"     "   s -> (s0,s1,...s[n-1]), (s1,s2,...,sn), ...                   "     it = iter(seq)     result = tuple(islice(it, n))     if len(result) == n:         yield result         for elem in it:         result = result[1:] + (elem,)         yield result  def missing_elements(L):     missing = chain.from_iterable(range(x + 1, y) for x, y in window(L) if (y - x) > 1)     return list(missing) 

This is a pure O(n) operation, and if you know the number of missing items, you can make sure it only produces those and then stops:

def missing_elements(L, count):     missing = chain.from_iterable(range(x + 1, y) for x, y in window(L) if (y - x) > 1)     return list(islice(missing, 0, count)) 

This will handle larger gaps too; if you are missing 2 items at 11 and 12, it'll still work:

>>> missing_elements([10, 13, 14, 15], 2) [11, 12] 

and the above sample only had to iterate over [10, 13] to figure this out.

like image 139
Martijn Pieters Avatar answered Oct 17 '22 00:10

Martijn Pieters


Assuming that L is a list of integers with no duplicates, you can infer that the part of the list between start and index is completely consecutive if and only if L[index] == L[start] + (index - start) and similarly with index and end is completely consecutive if and only if L[index] == L[end] - (end - index). This combined with splitting the list into two recursively gives a sublinear solution.

# python 3.3 and up, in older versions, replace "yield from" with yield loop  def missing_elements(L, start, end):     if end - start <= 1:          if L[end] - L[start] > 1:             yield from range(L[start] + 1, L[end])         return      index = start + (end - start) // 2      # is the lower half consecutive?     consecutive_low =  L[index] == L[start] + (index - start)     if not consecutive_low:         yield from missing_elements(L, start, index)      # is the upper part consecutive?     consecutive_high =  L[index] == L[end] - (end - index)     if not consecutive_high:         yield from missing_elements(L, index, end)  def main():     L = [10,11,13,14,15,16,17,18,20]     print(list(missing_elements(L,0,len(L)-1)))     L = range(10, 21)     print(list(missing_elements(L,0,len(L)-1)))  main() 
like image 29
Lie Ryan Avatar answered Oct 16 '22 23:10

Lie Ryan