Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

finding top k largest keys in a dictionary python

Tags:

python

Lets say I have a dictionary:

{key1:value1........... keyn:valuen}

So lets say I want to write a function

def return_top_k(dictionary, k):

    return list_of_keys_sorted   

What is the most efficient way (in terms of big O) to get the keys which have the top k values (maintaining the order i.e the highest value key is present in the beginning.. and so on.)

like image 243
frazman Avatar asked Sep 04 '12 15:09

frazman


3 Answers

O(n log k):

import heapq

k_keys_sorted = heapq.nlargest(k, dictionary)

You could use key keyword parameter to specify what should be used as a sorting key e.g.:

k_keys_sorted_by_values = heapq.nlargest(k, dictionary, key=dictionary.get)
like image 125
jfs Avatar answered Oct 31 '22 04:10

jfs


return sorted(dictionary, key=dictionary.get, reverse=True)[:10]

Should be at worst O(NlogN) (although heapq proposed by others is probably better) ...

It might also make sense to use a Counter instead of a regular dictionary. In that case, the most_common method will do (approximately) what you want (dictionary.most_common(10)), but only if it makes sense to use a Counter in your API.

like image 7
mgilson Avatar answered Oct 31 '22 05:10

mgilson


portfolio = [
   {'name': 'IBM', 'shares': 100, 'price': 91.1},
   {'name': 'AAPL', 'shares': 50, 'price': 543.22},
   {'name': 'FB', 'shares': 200, 'price': 21.09},
   {'name': 'HPQ', 'shares': 35, 'price': 31.75},
   {'name': 'YHOO', 'shares': 45, 'price': 16.35},
   {'name': 'ACME', 'shares': 75, 'price': 115.65}
]

cheap = heapq.nsmallest(3, portfolio, key=lambda s: s['price'])
expensive = heapq.nlargest(3, portfolio, key=lambda s: s['price'])
like image 4
Bala Avatar answered Oct 31 '22 05:10

Bala