
How to filter dictionary keys based on their corresponding values

I have:

dictionary = {"foo":12, "bar":2, "jim":4, "bob": 17} 

I want to iterate over this dictionary, but over the values instead of the keys, so I can use the values in another function.

For example, I want to test which dictionary values are greater than 6, and then store their keys in a list. My code looks like this:

list = []
for c in dictionary:
    if c > 6:
        list.append(dictionary[c])
print list

and then, in a perfect world, list would feature all the keys whose value is greater than 6. However, my for loop is only iterating over the keys; I would like to change that to the values!

Any help is greatly appreciated. Thank you.

Hoops asked May 08 '12 11:05


2 Answers

>>> d = {"foo": 12, "bar": 2, "jim": 4, "bob": 17}
>>> [k for k, v in d.items() if v > 6] # Use d.iteritems() on python 2.x
['bob', 'foo']

I'd like to just update this answer to also showcase the solution by @glarrain which I find myself tending to use nowadays.

[k for k in d if d[k] > 6] 

This works unchanged on both Python 2 and Python 3, and it avoids the confusing switch from .iteritems to .items. (On Python 2, .iteritems avoids building a list in memory; on Python 3, .items no longer builds one, so the distinction is gone.)

@Prof.Falken mentioned a solution to this problem

from six import iteritems 

which effectively fixes the cross compatibility issues BUT requires you to download the package six
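
As a rough sketch of how that import might be used: six.iteritems(d) returns an iterator over the (key, value) pairs on both Python 2 and Python 3. The try/except fallback below is an addition for illustration only, so the snippet still runs on Python 3 when six is not installed; it is not part of six.

```python
try:
    from six import iteritems  # third-party package: pip install six
except ImportError:
    # Illustrative fallback (an assumption, not part of six); Python 3 only.
    def iteritems(d):
        return iter(d.items())

d = {"foo": 12, "bar": 2, "jim": 4, "bob": 17}

# Same comprehension as the .items() version, written once for both Python versions.
keys_over_six = [k for k, v in iteritems(d) if v > 6]

# sorted() so the printed result does not depend on dictionary ordering.
print(sorted(keys_over_six))
```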

However, I would not fully agree with @glarrain that this solution is more readable. That is up for debate and may just be personal preference, even though Python is supposed to have only one way to do it. In my opinion it depends on the situation (e.g. you may have a long dictionary name you don't want to type twice, or you may want to give the values a more readable name).
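
To make that trade-off concrete, here is a small sketch with a deliberately long, made-up dictionary name (monthly_signups_by_region and its contents are hypothetical, not from the question). With .items() the dictionary is named once and the value gets its own descriptive name; with key lookup the long name has to be typed twice.

```python
# Hypothetical data, invented only to illustrate the readability point.
monthly_signups_by_region = {"north": 12, "south": 2, "east": 4, "west": 17}

# .items(): the dictionary is named once and the value gets a readable name.
busy_regions_items = [
    region
    for region, signups in monthly_signups_by_region.items()
    if signups > 6
]

# Key lookup: no tuple unpacking, but the long name is repeated.
busy_regions_lookup = [
    region
    for region in monthly_signups_by_region
    if monthly_signups_by_region[region] > 6
]

# Both forms select the same keys in the same order.
print(busy_regions_items == busy_regions_lookup)
```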

Some interesting timings:

In Python 2 the second solution is faster; in Python 3 the two are almost exactly equal in raw speed.


$ python -m timeit -s 'd = {"foo": 12, "bar": 2, "jim": 4, "bob": 17};' '[k for k, v in d.items() if v > 6]'
1000000 loops, best of 3: 0.772 usec per loop
$ python -m timeit -s 'd = {"foo": 12, "bar": 2, "jim": 4, "bob": 17};' '[k for k, v in d.iteritems() if v > 6]'
1000000 loops, best of 3: 0.508 usec per loop
$ python -m timeit -s 'd = {"foo": 12, "bar": 2, "jim": 4, "bob": 17};' '[k for k in d if d[k] > 6]'
1000000 loops, best of 3: 0.45 usec per loop

$ python3 -m timeit -s 'd = {"foo": 12, "bar": 2, "jim": 4, "bob": 17};' '[k for k, v in d.items() if v > 6]'
1000000 loops, best of 3: 1.02 usec per loop
$ python3 -m timeit -s 'd = {"foo": 12, "bar": 2, "jim": 4, "bob": 17};' '[k for k in d if d[k] > 6]'
1000000 loops, best of 3: 1.02 usec per loop

However, these are only tests on small dictionaries. For huge dictionaries I was pretty sure that avoiding the key lookup (d[k]) would make .items faster, and this seems to be the case, although the gap is small on Python 2:

$ python -m timeit -s 'd = {i: i for i in range(-10000000, 10000000)};' -n 1 '[k for k in d if d[k] > 6]'
1 loops, best of 3: 1.75 sec per loop
$ python -m timeit -s 'd = {i: i for i in range(-10000000, 10000000)};' -n 1 '[k for k, v in d.iteritems() if v > 6]'
1 loops, best of 3: 1.71 sec per loop
$ python3 -m timeit -s 'd = {i: i for i in range(-10000000, 10000000)};' -n 1 '[k for k in d if d[k] > 6]'
1 loops, best of 3: 3.08 sec per loop
$ python3 -m timeit -s 'd = {i: i for i in range(-10000000, 10000000)};' -n 1 '[k for k, v in d.items() if v > 6]'
1 loops, best of 3: 2.47 sec per loop
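
A similar comparison can be run from inside Python with the standard timeit module. This is only a sketch: the dictionary size (200,000 keys) and the repeat counts are arbitrary choices made here to keep the run short, the function names by_lookup and by_items are made up, and the numbers will differ by machine and Python version.

```python
import timeit

# Arbitrary size chosen for a quick run; far smaller than the 20 million keys used in the shell test.
d = {i: i for i in range(-100000, 100000)}

def by_lookup():
    # Iterate over keys and look each value up again.
    return [k for k in d if d[k] > 6]

def by_items():
    # Iterate over (key, value) pairs directly.
    return [k for k, v in d.items() if v > 6]

# Best of 3 repeats, each repeat timing 5 calls.
lookup_time = min(timeit.repeat(by_lookup, number=5, repeat=3))
items_time = min(timeit.repeat(by_items, number=5, repeat=3))

print("d[k] lookup: %.4f sec" % lookup_time)
print(".items():    %.4f sec" % items_time)
```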
jamylak answered Oct 05 '22 23:10


To get just the values, use dictionary.values().

To get key-value pairs, use dictionary.items().
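
Applied to the loop from the question, that could look like the sketch below. The variable names values_over_six and keys_over_six are made up here, and the result is deliberately not called list so that it does not shadow the built-in.

```python
dictionary = {"foo": 12, "bar": 2, "jim": 4, "bob": 17}

# .values() iterates over the values only; the keys are not available in this loop.
values_over_six = []
for value in dictionary.values():
    if value > 6:
        values_over_six.append(value)

# .items() yields (key, value) pairs, so the key can be stored when its value passes the test.
keys_over_six = []
for key, value in dictionary.items():
    if value > 6:
        keys_over_six.append(key)

print(sorted(values_over_six))
print(sorted(keys_over_six))
```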

Sionide21 answered Oct 05 '22 21:10