Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Get the second largest number in a list in linear time

I'm learning Python and the simple ways to handle lists is presented as an advantage. Sometimes it is, but look at this:

>>> numbers = [20,67,3,2.6,7,74,2.8,90.8,52.8,4,3,2,5,7] >>> numbers.remove(max(numbers)) >>> max(numbers) 74 

A very easy, quick way of obtaining the second largest number from a list. Except that the easy list processing helps write a program that runs through the list twice over, to find the largest and then the 2nd largest. It's also destructive - I need two copies of the data if I wanted to keep the original. We need:

>>> numbers = [20,67,3,2.6,7,74,2.8,90.8,52.8,4,3,2,5,7] >>> if numbers[0]>numbers[1]): ...    m, m2 = numbers[0], numbers[1] ... else: ...    m, m2 = numbers[1], numbers[0] ... >>> for x in numbers[2:]: ...    if x>m2: ...       if x>m: ...          m2, m = m, x ...       else: ...          m2 = x ... >>> m2 74 

Which runs through the list just once, but isn't terse and clear like the previous solution.

So: is there a way, in cases like this, to have both? The clarity of the first version, but the single run through of the second?

like image 380
boisvert Avatar asked Apr 25 '13 22:04

boisvert


People also ask

What is the time taken to find the second largest element in an array?

A simple way to find the largest and second-largest number is that sort the array in descending order and pick its first and second elements. Its first element will be the largest number and the second number will be the second-largest number. The time complexity of this solution is O(n log n).


2 Answers

You could use the heapq module:

>>> el = [20,67,3,2.6,7,74,2.8,90.8,52.8,4,3,2,5,7] >>> import heapq >>> heapq.nlargest(2, el) [90.8, 74] 

And go from there...

like image 74
Jon Clements Avatar answered Sep 30 '22 04:09

Jon Clements


Since @OscarLopez and I have different opinions on what the second largest means, I'll post the code according to my interpretation and in line with the first algorithm provided by the questioner.

def second_largest(numbers):     count = 0     m1 = m2 = float('-inf')     for x in numbers:         count += 1         if x > m2:             if x >= m1:                 m1, m2 = x, m1                         else:                 m2 = x     return m2 if count >= 2 else None 

(Note: Negative infinity is used here instead of None since None has different sorting behavior in Python 2 and 3 – see Python - Find second smallest number; a check for the number of elements in numbers makes sure that negative infinity won't be returned when the actual answer is undefined.)

If the maximum occurs multiple times, it may be the second largest as well. Another thing about this approach is that it works correctly if there are less than two elements; then there is no second largest.

Running the same tests:

second_largest([20,67,3,2.6,7,74,2.8,90.8,52.8,4,3,2,5,7]) => 74 second_largest([1,1,1,1,1,2]) => 1 second_largest([2,2,2,2,2,1]) => 2 second_largest([10,7,10]) => 10 second_largest([1,1,1,1,1,1]) => 1 second_largest([1]) => None second_largest([]) => None 

Update

I restructured the conditionals to drastically improve performance; almost by a 100% in my testing on random numbers. The reason for this is that in the original version, the elif was always evaluated in the likely event that the next number is not the largest in the list. In other words, for practically every number in the list, two comparisons were made, whereas one comparison mostly suffices – if the number is not larger than the second largest, it's not larger than the largest either.

like image 23
Thijs van Dien Avatar answered Sep 30 '22 02:09

Thijs van Dien