When I write:
lines = (line.strip() for line in open('a_file'))
Is the file opened immediately or is the file system only accessed when I start to consume the generator expression?
Generators are memory-friendly because they produce each item only when it is requested instead of storing the whole sequence. We can define generators with generator expressions or generator functions, and chain several generators together to build memory-efficient data pipelines.
A Python generator function returns a generator object, which is a kind of iterator. It yields its items one at a time rather than all at once. A generator can also be written as an expression whose syntax is similar to a list comprehension, with parentheses instead of square brackets.
Iterators are objects whose __next__() method (called through the built-in next()) returns the next value of the sequence. A generator is created by a function that produces a sequence of values using yield statements. Iterators can be implemented as classes that define __iter__() and __next__(); generators are implemented as functions.
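To make the class-versus-function distinction concrete, here is a minimal sketch that writes the same sequence both ways. The names Countdown and countdown are hypothetical and chosen only for illustration.

```python
class Countdown:
    """Iterator implemented as a class with __iter__ and __next__."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # An iterator returns itself from __iter__.
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


def countdown(start):
    """The same sequence written as a generator function."""
    while start > 0:
        yield start
        start -= 1


class_result = list(Countdown(3))
gen_result = list(countdown(3))
print(class_result, gen_result)
```

Both forms can be consumed with next(), a for loop, or list(); the generator function simply lets Python write the __iter__/__next__ machinery for you.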
It is opened immediately. You can verify this by using a filename that does not exist: creating the generator expression raises an exception (FileNotFoundError in Python 3) right away, which shows that Python tried to open the file immediately.
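The check described above can be sketched as follows, assuming Python 3. The path is hypothetical: it points into a freshly created temporary directory, so the file is guaranteed not to exist.

```python
import os
import tempfile

# Hypothetical path inside a new, empty temporary directory.
missing = os.path.join(tempfile.mkdtemp(), "no_such_file.txt")

try:
    # open() runs while the generator expression is being built,
    # so the error surfaces here, before any iteration happens.
    lines = (line.strip() for line in open(missing))
except FileNotFoundError as exc:
    raised_at_construction = True
    print("raised at construction:", type(exc).__name__)
else:
    raised_at_construction = False
```

Note that nothing in the try block ever calls next() on the generator; the exception comes from building it.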
You can also use a function that gives more feedback to see that the command is executed even before the generator is iterated over:
def somefunction(filename):
    print(filename)
    return open(filename)

lines = (line.strip() for line in somefunction('a_file'))  # prints
However, if you use a generator function instead of a generator expression, the file is only opened when you iterate over it:
def somefunction(filename):
    print(filename)
    for line in open(filename):
        yield line.strip()

lines = somefunction('a_file')  # no print!
list(lines)  # prints, because list() iterates over the generator
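If lazy opening is what you want, one possible variation on the generator function above is to wrap the open() call in a with block so the file is also closed once iteration finishes. This is only a sketch: stripped_lines and the temporary file contents are made up for the demonstration.

```python
import os
import tempfile


def stripped_lines(filename):
    """Lazily yield stripped lines; the file is opened on the first next()."""
    with open(filename) as f:  # not executed until iteration starts
        for line in f:
            yield line.strip()


# Demo with a temporary file (hypothetical contents).
fd, path = tempfile.mkstemp(text=True)
with os.fdopen(fd, "w") as f:
    f.write("  first \n second\n")

lines = stripped_lines(path)  # nothing has been opened yet
result = list(lines)          # opens, reads, and closes the file
os.remove(path)
print(result)
```

One caveat with this design: if the generator is abandoned before it is exhausted, the file stays open until the generator is closed or garbage-collected.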
open() is called immediately upon the construction of the generator, irrespective of when or whether you consume from it.
The relevant spec is PEP 289:
Early Binding versus Late Binding
After much discussion, it was decided that the first (outermost) for-expression should be evaluated immediately and that the remaining expressions be evaluated when the generator is executed.
Asked to summarize the reasoning for binding the first expression, Guido offered [5]:
Consider sum(x for x in foo()). Now suppose there's a bug in foo() that raises an exception, and a bug in sum() that raises an exception before it starts iterating over its argument. Which exception would you expect to see? I'd be surprised if the one in sum() was raised rather the one in foo(), since the call to foo() is part of the argument to sum(), and I expect arguments to be processed before the function is called.
OTOH, in sum(bar(x) for x in foo()), where sum() and foo() are bugfree, but bar() raises an exception, we have no choice but to delay the call to bar() until sum() starts iterating -- that's part of the contract of generators. (They do nothing until their next() method is first called.)
See the rest of that section for further discussion.
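The binding rule quoted from PEP 289 can be observed directly. In this sketch, outer, inner, and transform are hypothetical helpers that record when they are called: only the outermost iterable is evaluated when the generator expression is created, and everything else waits until iteration.

```python
calls = []


def outer():
    calls.append("outer")
    return [1, 2]


def inner():
    calls.append("inner")
    return "ab"


def transform(x):
    calls.append("transform")
    return x * 10


# Only outer() runs on this line; inner() and transform() are deferred.
gen = (transform(x) + len(y) for x in outer() for y in inner())
calls_after_construction = list(calls)

values = list(gen)  # now inner() and transform() are executed
print(calls_after_construction)
print(values)
```

This is the same reason open('a_file') in the question runs immediately: it is the outermost iterable of the generator expression.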
It is opened immediately.
Example:
def func():
    print('x')
    return [1, 2, 3]

g = (x for x in func())
Output:
x
The function needs to return an iterable object.
open() returns an open file object that is iterable.
Therefore, the file will be opened when you define the generator expression.