Find the Smallest Integer Not in a List

2 Answers

If the datastructure can be mutated in place and supports random access then you can do it in O(N) time and O(1) additional space. Just go through the array sequentially and for every index write the value at the index to the index specified by value, recursively placing any value at that location to its place and throwing away values > N. Then go again through the array looking for the spot where value doesn't match the index - that's the smallest value not in the array. This results in at most 3N comparisons and only uses a few values worth of temporary space.

# Pass 1, move every value to the position of its value for cursor in range(N):     target = array[cursor]     while target < N and target != array[target]:         new_target = array[target]         array[target] = target         target = new_target  # Pass 2, find first location where the index doesn't match the value for cursor in range(N):     if array[cursor] != cursor:         return cursor return N

195

answered Dec 03 '22 04:12

Ants Aasma

Here's a simple O(N) solution that uses O(N) space. I'm assuming that we are restricting the input list to non-negative numbers and that we want to find the first non-negative number that is not in the list.

Find the length of the list; lets say it is N.
Allocate an array of N booleans, initialized to all false.
For each number X in the list, if X is less than N, set the X'th element of the array to true.
Scan the array starting from index 0, looking for the first element that is false. If you find the first false at index I, then I is the answer. Otherwise (i.e. when all elements are true) the answer is N.

In practice, the "array of N booleans" would probably be encoded as a "bitmap" or "bitset" represented as a byte or int array. This typically uses less space (depending on the programming language) and allows the scan for the first false to be done more quickly.

This is how / why the algorithm works.

Suppose that the N numbers in the list are not distinct, or that one or more of them is greater than N. This means that there must be at least one number in the range 0 .. N - 1 that is not in the list. So the problem of find the smallest missing number must therefore reduce to the problem of finding the smallest missing number less than N. This means that we don't need to keep track of numbers that are greater or equal to N ... because they won't be the answer.

The alternative to the previous paragraph is that the list is a permutation of the numbers from 0 .. N - 1. In this case, step 3 sets all elements of the array to true, and step 4 tells us that the first "missing" number is N.

The computational complexity of the algorithm is O(N) with a relatively small constant of proportionality. It makes two linear passes through the list, or just one pass if the list length is known to start with. There is no need to represent the hold the entire list in memory, so the algorithm's asymptotic memory usage is just what is needed to represent the array of booleans; i.e. O(N) bits.

(By contrast, algorithms that rely on in-memory sorting or partitioning assume that you can represent the entire list in memory. In the form the question was asked, this would require O(N) 64-bit words.)

@Jorn comments that steps 1 through 3 are a variation on counting sort. In a sense he is right, but the differences are significant:

A counting sort requires an array of (at least) Xmax - Xmin counters where Xmax is the largest number in the list and Xmin is the smallest number in the list. Each counter has to be able to represent N states; i.e. assuming a binary representation it has to have an integer type (at least) ceiling(log2(N)) bits.
To determine the array size, a counting sort needs to make an initial pass through the list to determine Xmax and Xmin.
The minimum worst-case space requirement is therefore ceiling(log2(N)) * (Xmax - Xmin) bits.

By contrast, the algorithm presented above simply requires N bits in the worst and best cases.

However, this analysis leads to the intuition that if the algorithm made an initial pass through the list looking for a zero (and counting the list elements if required), it would give a quicker answer using no space at all if it found the zero. It is definitely worth doing this if there is a high probability of finding at least one zero in the list. And this extra pass doesn't change the overall complexity.

EDIT: I've changed the description of the algorithm to use "array of booleans" since people apparently found my original description using bits and bitmaps to be confusing.

answered Dec 03 '22 04:12

Stephen C

Related questions
                            
                                Removing from array during enumeration in Swift?
                            
                                How to check identical array in most efficient way? [duplicate]
                            
                                Why do PHP Array Examples Leave a Trailing Comma?
                            
                                Remove objects with a duplicate property from List
                            
                                Create thread safe array in Swift
                            
                                Recursive function to generate multidimensional array from database result
                            
                                Uncaught TypeError: Object.values is not a function JavaScript
                            
                                Java 8 times faster with arrays than std::vector in C++. What did I do wrong?
                            
                                When to use HashMap over LinkedList or ArrayList and vice-versa
                            
                                Java stream toArray() convert to a specific type of array
                            
                                What do square brackets mean in array initialization in C?
                            
                                Access array returned by a function in php
                            
                                ArrayList: how does the size increase?
                            
                                C++ pass an array by reference
                            
                                AssertEquals 2 Lists ignore order
                            
                                Index of element in NumPy array [duplicate]
                            
                                How to find the index of an element in an int array?
                            
                                How can I dynamically add items to a Java array?
                            
                                String search in string array in objective c
                            
                                Count all values in a matrix greater than a value

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Find the Smallest Integer Not in a List

Tags:

arrays

algorithm

PeterAllenWebb

People also ask

2 Answers

Ants Aasma

Stephen C

Recent Activity

Donate For Us