In the examples below, resp.results is an iterator.
Version 1:

def version1(resp):
    items = []
    for result in resp.results:
        item = process(result)
        items.append(item)
    return iter(items)
Version 2:

def version2(resp):
    for result in resp.results:
        yield process(result)
Is returning iter(items) in Version 1 any better/worse in terms of performance/memory savings than simply returning items?
In the "Python Cookbook," Alex says the explicit iter() is "more flexible but less often used," but what are the pros/cons of returning iter(items) vs yield as in Version 2?
Also, what is the best way to unit-test an iterator and/or a generator? You can't do len(results) to check the size of the output.
Iterables, functions, and generators in Python

Using yield improves memory efficiency, and consequently speed, when looping over a large iterable. These benefits are less pronounced for small data sets, but as the data grows, so does the benefit of using yield.
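A quick, informal way to see the memory difference is to compare the size of a fully materialized list against an equivalent generator expression (a sketch; the million-element range is just an illustrative size):

```python
import sys

# A list materializes every element up front; a generator holds only
# its running state, regardless of how many items it will produce.
as_list = [n * n for n in range(1_000_000)]   # all elements in memory
as_gen = (n * n for n in range(1_000_000))    # lazy; computed on demand

print(sys.getsizeof(as_list))  # megabytes of container overhead
print(sys.getsizeof(as_gen))   # a couple hundred bytes
```

Note that sys.getsizeof only measures the container itself, not the referenced objects, but the asymmetry is already dramatic.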
The yield statement pauses the function, hands the value back to the caller, and resumes from where it left off the next time the generator is advanced. A function can yield multiple times. A return statement, by contrast, ends execution of the function and hands a single value back to the caller.
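The pause-and-resume behavior is easy to observe by stepping a small generator by hand with next():

```python
def countdown(n):
    # Execution pauses at each yield and resumes here on the next next() call.
    while n > 0:
        yield n
        n -= 1

gen = countdown(3)
print(next(gen))  # 3
print(next(gen))  # 2  (resumed after the previous yield)
print(next(gen))  # 1
```

A subsequent next(gen) would raise StopIteration, which is how a for loop knows the generator is exhausted.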
The yield keyword in Python controls the flow of a generator function, much as a return statement hands back a value from an ordinary function.
The yield keyword pauses the generator function's execution, and the value of the expression following yield is returned to the generator's caller. It can be thought of as a generator-based version of return. yield can appear only inside the generator function that contains it.
It's easy to turn an iterator or generator back into a list if you need it:
results = [item for item in iterator]
Or as kindly pointed out in the comments, an even simpler method:
results = list(iterator)
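That same trick answers the unit-testing question: materialize the generator with list() inside the test, then assert on the length and contents. A minimal sketch, where process and version2 are hypothetical stand-ins for the code under test:

```python
import unittest

def process(result):
    # Hypothetical stand-in for the real processing step.
    return result.upper()

def version2(results):
    # Generator version of the pipeline, as in the question.
    for result in results:
        yield process(result)

class TestVersion2(unittest.TestCase):
    def test_processes_all_items(self):
        # Materialize the generator, then len() and equality checks work.
        out = list(version2(["a", "b"]))
        self.assertEqual(len(out), 2)
        self.assertEqual(out, ["A", "B"])
```

Because generators are single-use, materialize once into a local variable and run all assertions against that list rather than iterating the generator twice.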