Get the second largest number in a list in linear time

Tags:

python

I'm learning Python and the simple ways to handle lists is presented as an advantage. Sometimes it is, but look at this:

Click to copy

>>> numbers = [20,67,3,2.6,7,74,2.8,90.8,52.8,4,3,2,5,7] >>> numbers.remove(max(numbers)) >>> max(numbers) 74

A very easy, quick way of obtaining the second largest number from a list. Except that the easy list processing helps write a program that runs through the list twice over, to find the largest and then the 2nd largest. It's also destructive - I need two copies of the data if I wanted to keep the original. We need:

Click to copy

>>> numbers = [20,67,3,2.6,7,74,2.8,90.8,52.8,4,3,2,5,7] >>> if numbers[0]>numbers[1]): ...    m, m2 = numbers[0], numbers[1] ... else: ...    m, m2 = numbers[1], numbers[0] ... >>> for x in numbers[2:]: ...    if x>m2: ...       if x>m: ...          m2, m = m, x ...       else: ...          m2 = x ... >>> m2 74

Which runs through the list just once, but isn't terse and clear like the previous solution.

So: is there a way, in cases like this, to have both? The clarity of the first version, but the single run through of the second?

380

asked Apr 25 '13 22:04

boisvert

2 Answers

You could use the heapq module:

Click to copy

>>> el = [20,67,3,2.6,7,74,2.8,90.8,52.8,4,3,2,5,7] >>> import heapq >>> heapq.nlargest(2, el) [90.8, 74]

And go from there...

answered Sep 30 '22 04:09

Jon Clements

Since @OscarLopez and I have different opinions on what the second largest means, I'll post the code according to my interpretation and in line with the first algorithm provided by the questioner.

Click to copy

def second_largest(numbers):     count = 0     m1 = m2 = float('-inf')     for x in numbers:         count += 1         if x > m2:             if x >= m1:                 m1, m2 = x, m1                         else:                 m2 = x     return m2 if count >= 2 else None

(Note: Negative infinity is used here instead of None since None has different sorting behavior in Python 2 and 3 – see Python - Find second smallest number; a check for the number of elements in numbers makes sure that negative infinity won't be returned when the actual answer is undefined.)

If the maximum occurs multiple times, it may be the second largest as well. Another thing about this approach is that it works correctly if there are less than two elements; then there is no second largest.

Running the same tests:

Click to copy

second_largest([20,67,3,2.6,7,74,2.8,90.8,52.8,4,3,2,5,7]) => 74 second_largest([1,1,1,1,1,2]) => 1 second_largest([2,2,2,2,2,1]) => 2 second_largest([10,7,10]) => 10 second_largest([1,1,1,1,1,1]) => 1 second_largest([1]) => None second_largest([]) => None

Update

I restructured the conditionals to drastically improve performance; almost by a 100% in my testing on random numbers. The reason for this is that in the original version, the elif was always evaluated in the likely event that the next number is not the largest in the list. In other words, for practically every number in the list, two comparisons were made, whereas one comparison mostly suffices – if the number is not larger than the second largest, it's not larger than the largest either.

answered Sep 30 '22 02:09

Thijs van Dien

Related questions
                            
                                NotImplementedError: Cannot convert a symbolic Tensor (2nd_target:0) to a numpy array
                            
                                Error when calling the metaclass bases: function() argument 1 must be code, not str
                            
                                Pyspark replace strings in Spark dataframe column
                            
                                How to make an axes occupy multiple subplots with pyplot (Python)
                            
                                List of all unique characters in a string?
                            
                                Schedule a repeating event in Python 3
                            
                                Explain the aggregate functionality in Spark (with Python and Scala)
                            
                                create & read from tempfile
                            
                                Python equivalent for #ifdef DEBUG
                            
                                VSCode pytest test discovery fails
                            
                                Adding folders to a zip file using python
                            
                                How to share the global app object in flask?
                            
                                Complexity of len() with regard to sets and lists
                            
                                Get the list of figures in matplotlib
                            
                                How to get every first element in 2 dimensional list
                            
                                Installed virtualenv and virtualenvwrapper: Python says no module named virtualenvwrapper
                            
                                How do I install python on alpine linux?
                            
                                PIL /JPEG Library: "decoder jpeg not available"
                            
                                Sliding window of M-by-N shape numpy.ndarray
                            
                                python list comprehension with multiple 'if's

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Get the second largest number in a list in linear time

Tags:

performance

python

boisvert

People also ask

2 Answers

Jon Clements

Thijs van Dien

Recent Activity

Donate For Us