
Filter values inside Python generator expressions

I have a dictionary dct, and I want to sum those of its values whose keys appear in a specified list lst.

The code I am using so far is:

sum(dct[k] for k in lst)

In the generator expression above I would like to handle the KeyError raised when a key from the list is not found in the dictionary. I cannot work out how to implement (syntax-wise) either a try-except approach or an if-else approach inside this generator expression.

If a key from the list is not found in the dictionary, the expression should carry on with the remaining keys. The sum should not be affected by any missing keys, and if none of the keys exist the result should be zero.

Yannis asked Jun 12 '16 09:06


2 Answers

There are a few options; the preferred one is to use dict.get():

# 1
sum(dct.get(k, 0) for k in lst)
# 2
sum(dct[k] for k in lst if k in dct)
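To make the behaviour concrete, here is a small sketch with made-up sample data (the contents of dct and lst below are assumptions for illustration only). Both forms skip the missing keys, and the sum is 0 when nothing matches:

```python
# Hypothetical sample data: "c" and "z" are not keys of dct.
dct = {"a": 1, "b": 2, "d": 4}
lst = ["a", "c", "d", "z"]

# Option 1: missing keys contribute the default value 0.
total_get = sum(dct.get(k, 0) for k in lst)

# Option 2: missing keys are filtered out before the lookup.
total_cond = sum(dct[k] for k in lst if k in dct)

# No key matches: sum() over an empty generator is 0.
total_none = sum(dct[k] for k in ["x", "y"] if k in dct)

print(total_get, total_cond, total_none)  # 5 5 0
```

Option 1 does a single lookup per key, while option 2 does a membership test plus an indexing lookup for every key that is present.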

Another option is to filter lst before iterating over it:

sum(dct[k] for k in filter(lambda i: i in dct, lst))

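You can also use the reduce function on the filtered list as an alternative to sum (on Python 3 it has to be imported from functools). Pass 0 as the initial value; otherwise the first key itself seeds the accumulator, and an empty list raises a TypeError:

reduce(lambda a, k: a + dct[k], filter(lambda i: i in dct, lst), 0)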
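A caveat worth checking with reduce: when no initial value is passed, the first element of the filtered sequence (a key, not its value) seeds the accumulator, and an empty sequence raises TypeError. The sketch below uses made-up data with string keys so the problem shows up as an error; with integer keys the result could be silently wrong instead.

```python
from functools import reduce  # reduce is not a builtin on Python 3

# Hypothetical data where keys differ from values.
dct = {"a": 10, "b": 20}
lst = ["a", "x", "b"]

present = [k for k in lst if k in dct]  # ["a", "b"]

# With an initial value of 0 the accumulator starts as a number.
with_init = reduce(lambda acc, k: acc + dct[k], present, 0)

# Without it, the accumulator starts as the key "a", so "a" + 20 fails.
try:
    reduce(lambda acc, k: acc + dct[k], present)
    failed = False
except TypeError:
    failed = True

# With an initial value, an empty input simply returns that value.
empty_total = reduce(lambda acc, k: acc + dct[k], [], 0)

print(with_init, failed, empty_total)  # 30 True 0
```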

Now let's find the fastest approach with timeit:

from functools import reduce  # reduce is not a builtin on Python 3
from timeit import timeit
import random

lst = range(0, 10000)
dct = {x:x for x in lst if random.choice([True, False])}

via_sum = lambda:(sum(dct.get(k, 0) for k in lst))
print("Via sum and get: %s" % timeit(via_sum, number=10000))
# Via sum and get: 16.725695848464966

via_sum_and_cond = lambda:(sum(dct[k] for k in lst if k in dct))
print("Via sum and condition: %s" % timeit(via_sum_and_cond, number=10000))
# Via sum and condition: 9.4715681076

via_reduce = lambda:(reduce(lambda a, k: a + dct[k], filter(lambda i: i in dct, lst), 0))
print("Via reduce: %s" % timeit(via_reduce, number=10000))
# Via reduce: 19.9522120953

So, in this benchmark, the fastest option is to sum the items using an if condition within the generator expression:

sum(dct[k] for k in lst if k in dct) # Via sum and condition: 9.4715681076
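Timings like these depend on the Python version, the machine, and how many keys are missing (about half here), so it may be worth re-running the comparison locally. Below is a rough Python 3 sketch of the same benchmark. The helper name bench and the smaller repeat count are assumptions, and no timings are shown because they will vary:

```python
from functools import reduce
from timeit import timeit
import random

lst = list(range(10000))
# Roughly half of the keys end up in the dictionary.
dct = {x: x for x in lst if random.choice([True, False])}

candidates = {
    "sum and get": lambda: sum(dct.get(k, 0) for k in lst),
    "sum and condition": lambda: sum(dct[k] for k in lst if k in dct),
    "reduce": lambda: reduce(lambda a, k: a + dct[k],
                             filter(lambda i: i in dct, lst), 0),
}

def bench(number=20):
    """Check that all variants agree, then time each one (hypothetical helper)."""
    expected = sum(dct.values())  # every key of dct occurs exactly once in lst
    results = {}
    for name, func in candidates.items():
        assert func() == expected  # every variant must give the same sum
        results[name] = timeit(func, number=number)
    return results

for name, seconds in bench().items():
    print("Via %s: %s" % (name, seconds))
```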

Good luck!

Andriy Ivaneyko answered Oct 10 '22 17:10


You have two options:

Checking if the key exists

sum(dct[k] for k in lst if k in dct)

or using get

sum(dct.get(k, 0) for k in lst)

where dct.get(k, 0) returns dct[k] if k is a key in dct or 0 if not.
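As for the try-except approach mentioned in the question: a try statement cannot appear inside a generator expression, because only expressions are allowed there, but it can be moved into a small helper function. A sketch, where the helper name value_or_zero and the sample data are assumptions:

```python
def value_or_zero(mapping, key):
    """Return mapping[key], or 0 if the key is missing (hypothetical helper)."""
    try:
        return mapping[key]
    except KeyError:
        return 0

# Made-up sample data for illustration.
dct = {"a": 1, "b": 2}
lst = ["a", "missing", "b"]

total = sum(value_or_zero(dct, k) for k in lst)
print(total)  # 3
```

In practice this does the same job as dct.get(k, 0), so the one-liner is usually the simpler choice.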

Wombatz answered Oct 10 '22 15:10