Quick question, which is more efficient for finding the smallest number (float) in a long list (10000+ elements)
is it
min(mylist)
or
mylist.sort()
and then returning
mylist[0]
or something else...
thanks!
Method : Using min()/max() + float() This problem can be solved using the min or max function in which we first convert the strings into float and then pass this logic in functions in respective min/max function.
The Python min() function returns the lowest value in a list of items. min() can be used to find the smallest number in a list or first string that would appear in the list if the list were ordered alphabetically.
Use Python's min() and max() to find smallest and largest values in your data. Call min() and max() with a single iterable or with any number of regular arguments. Use min() and max() with strings and dictionaries.
Process: Initially assign the element located at 0 to min using min = l[0]. using for loop visit each location serially from 1 to len(l)-1. if the element located in any position is lesser than min, then assign the element as min by using min = l[i] finally min holds the minimum value in the list.
Frst, if you care about performance in Python (which isn't always a sensible thing to care about, but that's another conversation), you should be using the timeit module. Even in C it's hard to predict how certain functions will behave after compilation, and it's harder in Python. People are often confidently expressing opinions about which functions are faster which are data-dependent. Then -- by using timeit, I mean -- you could've found out yourself.
Second, if you really care about performance on lists of floats, you shouldn't be using lists at all, but numpy arrays. Using IPython here, under Python 2.7.2, which makes timing things easy:
In [41]: import random, numpy
In [42]: a = [0.1*i for i in range(10**5)]
In [43]: timeit min(a)
100 loops, best of 3: 4.55 ms per loop
In [44]: timeit sorted(a)[0]
100 loops, best of 3: 4.57 ms per loop
In [45]: random.shuffle(a)
In [46]: timeit min(a)
100 loops, best of 3: 6.06 ms per loop
In [47]: timeit min(a) # to make sure it wasn't a fluke
100 loops, best of 3: 6.07 ms per loop
In [48]: timeit sorted(a)[0]
10 loops, best of 3: 65.9 ms per loop
In [49]: b = numpy.array(a)
In [50]: timeit b.min()
10000 loops, best of 3: 97.5 us per loop
And we note a few things. (1) Python's sort (timsort) works very well on data which has sorted runs, so sorting an already sorted list has almost no penalty. (2) Sorting a random list, on the other hand, is very much slower, and this will only get worse as the data gets larger. (3) Numpy.min() on a float array works sixty times faster than min on a Python list, because it doesn't have to be as general.
If the list is already populated, min()
is the most efficient way.
There are some tricks you might use in special scenarios:
O(1)
.min()
. Make sure you understand the effects, mainly a slower insertion. (credit: interjay)If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With