Sort the top ten results

Question

I am getting a list in which I am saving the results in the following way

City    Percentage
Mumbai  98.30
London  23.23
Agra    12.22
.....

The list structure is [["Mumbai",98.30],["London",23.23]..]

I am saving these records in the form of a list. I need the list sorted so that I get the top ten records. Even if I get only the cities, that would be fine.

I am trying to use the following logic, but it fails to provide accurate data:

if (condition):
    if b not in top_ten:
        top_ten.append(b)   
        top_ten.remove(tmp)

Any other solution or approach is also welcome.

EDIT 1

for a in sc_percentage:
    print a

This is the list I am getting:

(<ServiceCenter: DELHI-DLC>, 100.0)
(<ServiceCenter: DELHI-DLE>, 75.0)
(<ServiceCenter: DELHI-DLN>, 90.909090909090907)
(<ServiceCenter: DELHI-DLS>, 83.333333333333343)
(<ServiceCenter: DELHI-DLW>, 92.307692307692307)

Duncan · Accepted Answer

If the list is fairly short then, as others have suggested, you can sort it and slice it. If the list is very large, you may be better off using heapq.nlargest():

>>> import heapq
>>> lis = [['Mumbai', 98.3], ['London', 23.23], ['Agra', 12.22]]
>>> heapq.nlargest(2, lis, key=lambda x:x[1])
[['Mumbai', 98.3], ['London', 23.23]]
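
For comparison, the sort-and-slice approach mentioned above might look like the following minimal sketch. The names top_n_sorted, top_two and top_cities are made up for illustration; the data is the sample from the question.

```python
from operator import itemgetter

records = [['Mumbai', 98.3], ['London', 23.23], ['Agra', 12.22]]

def top_n_sorted(rows, n):
    # Sort by percentage (index 1), highest first, then keep the first n rows.
    # reverse=True matters: without it the slice returns the n smallest rows.
    return sorted(rows, key=itemgetter(1), reverse=True)[:n]

top_two = top_n_sorted(records, 2)

# If only the city names are needed, drop the percentages afterwards.
top_cities = [city for city, _ in top_two]
```

The same key works for the (ServiceCenter, percentage) tuples shown in the question's edit, since the percentage is also at index 1 there.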

The difference is that nlargest makes only a single pass through the input, and if you are reading from a file or other generated source, the records need not all be in memory at the same time.
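
As a rough illustration of the streaming case, nlargest accepts any iterable, so a generator can feed it one record at a time. The read_records helper and the sample lines below are assumptions for the sake of the example; in practice the lines could come from an open file.

```python
import heapq

def read_records(lines):
    """Yield (city, percentage) pairs one at a time from 'City Percentage' lines."""
    for line in lines:
        city, percentage = line.split()
        yield city, float(percentage)

# Hypothetical input standing in for a file object.
lines = ["Mumbai 98.30", "London 23.23", "Agra 12.22", "Delhi 75.00"]

# Only the current record and the two best seen so far are held at any time.
top_two = heapq.nlargest(2, read_records(lines), key=lambda rec: rec[1])
```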

You might also be interested in the source for nlargest(), as it works in much the same way that you were trying to solve the problem: it keeps only the desired number of elements in a data structure known as a heap; each new value is pushed onto the heap, and then the smallest value is popped from it.
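
The idea can be sketched by hand as below. This is a simplified illustration of the technique, not the actual standard-library implementation; the function name top_n and the (value, index, record) entry layout are choices made for this example.

```python
import heapq

def top_n(records, n):
    # Min-heap of (value, index, record); the smallest kept value sits at heap[0].
    # The index breaks ties so the records themselves are never compared.
    heap = []
    for index, record in enumerate(records):
        entry = (record[1], index, record)
        if len(heap) < n:
            heapq.heappush(heap, entry)
        else:
            # Push the new entry, then pop the smallest, so the heap never
            # holds more than n entries.
            heapq.heappushpop(heap, entry)
    # The heap is only partially ordered, so sort the survivors for output.
    return [record for _, _, record in sorted(heap, reverse=True)]
```

Compared with the append/remove logic in the question, the heap always knows which kept element is the smallest, so each new record is compared against the right one.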

Edit to show comparative timing:

>>> import random
>>> records = []
>>> for i in range(100000):
...     value = random.random() * 100
...     records.append(('city {:2.4f}'.format(value), value))
...
>>> import heapq
>>> import timeit
>>> heapq.nlargest(10, records, key=lambda x:x[1])
[('city 99.9995', 99.99948904248298), ('city 99.9974', 99.99738898315216), ('city 99.9964', 99.99642759230214), ('city 99.9935', 99.99345173704319), ('city 99.9916', 99.99162694442714), ('city 99.9908', 99.99075084123544), ('city 99.9887', 99.98865134685201), ('city 99.9879', 99.98792632193258), ('city 99.9872', 99.98724339718686), ('city 99.9854', 99.98540548350132)]
>>> timeit.timeit('sorted(records, key=lambda x:x[1])[:10]', setup='from __main__ import records', number=10)
1.388942152229788
>>> timeit.timeit('heapq.nlargest(10, records, key=lambda x:x[1])', setup='import heapq;from __main__ import records', number=10)
0.5476185073315492

On my system getting the top 10 from 100 records is fastest by sorting and slicing, but with 1,000 or more records it is faster to use nlargest.

Tags: python, tuples, python-2.7

onkar
