I have a list of dictionaries like the following:
lst = [{'a': 5}, {'b': 6}, {'c': 7}, {'d': 8}]
I wrote a generator expression like:
next((itm for itm in lst if itm['a']==5))
The strange part is that although this works for the key-value pair of 'a',
it throws an error as soon as I use any other key in the expression.
Expression:
next((itm for itm in lst if itm['b']==6))
Error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 1, in <genexpr>
KeyError: 'b'
Python generators are a simple way of creating iterators. The bookkeeping that a hand-written iterator class needs (the __iter__() and __next__() methods, and raising StopIteration) is handled automatically by generators. Simply put, a generator function returns an iterator object that we can iterate over one value at a time.
In Python, generators provide a convenient way to implement the iterator protocol. A generator is an iterator created by a function that contains a yield statement. Its main feature is that elements are evaluated on demand.
Instead of building a list, in Python 3 you can unpack (splat) the generator expression into a print() call, i.e. print(*(generator-expression)). This prints the elements separated by spaces, without commas and without the surrounding brackets.
So what's the difference between generator expressions and list comprehensions? A generator yields one item at a time and produces an item only on demand, whereas a list comprehension builds the whole list in memory up front.
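As a rough illustration of the on-demand behaviour described above, here is a minimal sketch. The names squares_list and squares_gen are made up for this example, and the size comparison reflects CPython's behaviour rather than a language guarantee.

```python
import sys

# A list comprehension builds all 1000 values immediately.
squares_list = [n * n for n in range(1000)]

# A generator expression builds nothing until a value is requested.
squares_gen = (n * n for n in range(1000))

# In CPython the generator object is far smaller than the full list.
print(sys.getsizeof(squares_list) > sys.getsizeof(squares_gen))

# next() asks for exactly one value.
first = next(squares_gen)
print(first)  # 0

# Unpacking a generator expression into print() shows the elements
# separated by spaces, without brackets or commas.
print(*(n * n for n in range(5)))  # 0 1 4 9 16
```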
That's not weird. For every itm in lst, the generator first evaluates the filter clause. If the filter clause is itm['b'] == 6, it tries to fetch the 'b' key from that dictionary. Since the first dictionary has no such key, it raises a KeyError.
For the first filter example that is not a problem, since the first dictionary has an 'a' key. next(..) is only interested in the first element emitted by the generator, so it never asks the generator to filter more elements.
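One way to see this for yourself is to record which dictionaries the filter actually touches. This is only a sketch: the helper has_a_equal_5 and the inspected list are hypothetical names introduced for the demonstration.

```python
lst = [{'a': 5}, {'b': 6}, {'c': 7}, {'d': 8}]
inspected = []  # every dictionary the filter clause was evaluated on

def has_a_equal_5(itm):
    # Hypothetical helper: same test as itm['a'] == 5, but it also
    # remembers which item it was asked about.
    inspected.append(itm)
    return itm['a'] == 5

first = next(itm for itm in lst if has_a_equal_5(itm))

print(first)      # {'a': 5}
print(inspected)  # [{'a': 5}]
```

Only the first dictionary is ever inspected, so the missing 'a' key in the other three never comes into play.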
You can use .get(..) here to make the lookup more failsafe:
next((itm for itm in lst if itm.get('b',None)==6))
If the dictionary has no such key, .get(..) returns None. Since None is not equal to 6, the filter skips the first dictionary and looks further for a match. Note that None is the default when you do not specify a default value, so an equivalent statement is:
next((itm for itm in lst if itm.get('b')==6))
We can also omit the parentheses around the generator expression; the extra parentheses are needed only when the call has multiple arguments:
next(itm for itm in lst if itm.get('b')==6)
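The multiple-argument case comes up when you pass a default to next(), which is returned instead of raising StopIteration when nothing matches. A small sketch follows; the key 'z' is made up here to force the no-match case.

```python
lst = [{'a': 5}, {'b': 6}, {'c': 7}, {'d': 8}]

# With a second argument, the generator expression needs its own parentheses.
match = next((itm for itm in lst if itm.get('b') == 6), None)
print(match)    # {'b': 6}

# No dictionary has a 'z' key, so the default is returned.
missing = next((itm for itm in lst if itm.get('z') == 1), None)
print(missing)  # None

# Without a default, an exhausted generator makes next() raise StopIteration.
try:
    next(itm for itm in lst if itm.get('z') == 1)
    raised = False
except StopIteration:
    raised = True
print(raised)   # True
```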
Take a look at your generator expression separately:
(itm for itm in lst if itm['a']==5)
This will collect all items in the list where itm['a'] == 5. So far so good.
When you call next() on it, you tell Python to generate the first item from that generator expression. But only the first.
So when you have the condition itm['a'] == 5, the generator will take the first element of the list, {'a': 5}, and perform the check on it. The condition is true, so that item is generated by the generator expression and returned by next().
Now, when you change the condition to itm['b'] == 6, the generator will again take the first element of the list, {'a': 5}, and attempt to get the element with the key 'b'. This will fail:
>>> itm = {'a': 5}
>>> itm['b']
Traceback (most recent call last):
File "<pyshell#1>", line 1, in <module>
itm['b']
KeyError: 'b'
It does not even get the chance to look at the second element because it already fails while trying to look at the first element.
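To make the point concrete, the generator expression can be unrolled into an explicit loop that reports where the lookup breaks. This is an illustrative sketch, not what Python runs internally, and the variable failed_at is a name chosen for this example.

```python
lst = [{'a': 5}, {'b': 6}, {'c': 7}, {'d': 8}]

failed_at = None  # index of the item whose lookup raised KeyError
found = None      # the first matching item, if the loop gets that far

for position, itm in enumerate(lst):
    try:
        if itm['b'] == 6:
            found = itm
            break
    except KeyError:
        # The generator expression has no such handler, so there the
        # exception propagates out of next() at this exact point.
        failed_at = position
        break

print(failed_at)  # 0
print(found)      # None
```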
To solve this, you have to avoid using an expression that can raise a KeyError here. You could use dict.get() to attempt to retrieve the value without raising an exception:
>>> lst = [{'a': 5}, {'b': 6}, {'c': 7}, {'d': 8}]
>>> next((itm for itm in lst if itm.get('b') == 6))
{'b': 6}
Obviously itm['b'] will raise a KeyError if there is no 'b' key in a dictionary. One way would be to do:
next((itm for itm in lst if 'b' in itm and itm['b']==6))
If you don't expect None as a value in any of the dictionaries, then you can simplify it to
next((itm for itm in lst if itm.get('b')==6))
(this works the same here since you compare to 6, but it would give the wrong result if you compared to None)
or, safely, with a placeholder:
PLACEHOLDER = object()
next((itm for itm in lst if itm.get('b', PLACEHOLDER)==6))
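The difference only shows up when None is a legitimate value. The sketch below uses a modified list, which is an assumption for illustration and not the list from the question, in which one dictionary really stores None under 'b'.

```python
# Assumed data for this example: the second dictionary stores None under 'b'.
lst = [{'a': 5}, {'b': None}, {'c': 7}]

# Plain .get() cannot tell "key missing" from "value is None",
# so the first dictionary (which has no 'b' at all) matches.
wrong = next(itm for itm in lst if itm.get('b') is None)
print(wrong)  # {'a': 5}

# A unique sentinel object is never equal or identical to real data,
# so only a dictionary that actually stores None under 'b' matches.
PLACEHOLDER = object()
right = next(itm for itm in lst if itm.get('b', PLACEHOLDER) is None)
print(right)  # {'b': None}
```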