
Generators and files

When I write:

lines = (line.strip() for line in open('a_file'))

Is the file opened immediately or is the file system only accessed when I start to consume the generator expression?

gboffi asked Aug 18 '17 13:08



3 Answers

It is opened immediately. You can verify this by using a filename that does not exist: an exception is raised as soon as the generator expression is created, which shows that Python tried to open the file immediately.
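A minimal sketch of that check, assuming a path inside a fresh temporary directory so that the file is certain to be missing (the name `no_such_file` is only illustrative):

```python
import os
import tempfile

# Hypothetical path: a fresh temp directory is empty, so this file cannot exist.
missing = os.path.join(tempfile.mkdtemp(), "no_such_file")

try:
    # The generator is never iterated, yet open() already runs here.
    lines = (line.strip() for line in open(missing))
except FileNotFoundError as exc:
    opened_eagerly = True
    print("raised at construction:", type(exc).__name__)
else:
    opened_eagerly = False
```

The exception comes from the line that builds the generator expression, not from a later `next()` or `for` loop.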

You can also use a function that gives more feedback to see that the command is executed even before the generator is iterated over:

def somefunction(filename):
    print(filename)
    return open(filename)

lines = (line.strip() for line in somefunction('a_file'))  # prints 'a_file' right away, before any iteration

However, if you use a generator function instead of a generator expression, the file is only opened when you start iterating over it:

def somefunction(filename):
    print(filename)
    # Nothing in this body, including open(), runs until iteration starts.
    with open(filename) as f:
        for line in f:
            yield line.strip()

lines = somefunction('a_file')  # no print!

list(lines)                     # prints, because list() iterates over the generator.
MSeifert answered Oct 15 '22 03:10


open() is called immediately upon the construction of the generator, irrespective of when or whether you consume from it.

The relevant spec is PEP 289:

Early Binding versus Late Binding

After much discussion, it was decided that the first (outermost) for-expression should be evaluated immediately and that the remaining expressions be evaluated when the generator is executed.

Asked to summarize the reasoning for binding the first expression, Guido offered [5]:

Consider sum(x for x in foo()). Now suppose there's a bug in foo() that raises an exception, and a bug in sum() that raises an exception before it starts iterating over its argument. Which exception would you expect to see? I'd be surprised if the one in sum() was raised rather the one in foo(), since the call to foo() is part of the argument to sum(), and I expect arguments to be processed before the function is called.

OTOH, in sum(bar(x) for x in foo()), where sum() and foo() are bugfree, but bar() raises an exception, we have no choice but to delay the call to bar() until sum() starts iterating -- that's part of the contract of generators. (They do nothing until their next() method is first called.)

See the rest of that section for further discussion.
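The split described in the quoted section can be observed directly. The sketch below reuses the names `foo` and `bar` from the quote; the `calls` list is only an illustrative way to record when each function runs:

```python
calls = []

def foo():
    # Outermost iterable: evaluated when the generator expression is created.
    calls.append("foo")
    return [1, 2, 3]

def bar(x):
    # Element expression: evaluated only while the generator is consumed.
    calls.append(f"bar({x})")
    return x * 2

gen = (bar(x) for x in foo())
after_construction = list(calls)   # snapshot before any iteration

total = sum(gen)                   # consuming the generator triggers bar()
after_consumption = list(calls)
```

Here `after_construction` holds only `'foo'`, while the `bar(...)` entries appear once `sum()` starts iterating.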

NPE answered Oct 15 '22 01:10


It is opened immediately.

Example:

def func():
    print('x')
    return [1, 2, 3]

g = (x for x in func())

Output:

x

The outermost iterable, here the return value of func(), is evaluated as soon as the generator expression is defined. open() likewise returns an open file object that is iterable, so the file is opened when you define the generator expression.
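Opening is eager, but reading is still lazy. A rough sketch, assuming a small temporary file written only for this demonstration (its path and contents are made up):

```python
import os
import tempfile

# Illustrative file created just for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "a_file")
with open(path, "w") as out:
    out.write("  first  \n  second  \n")

f = open(path)
lines = (line.strip() for line in f)

# The file object already exists and is open, but nothing has been read yet.
is_open_before_iteration = not f.closed
position_before_iteration = f.tell()

first = next(lines)   # the first read happens here
f.close()
```

Keeping a reference to the file object, as `f` does here, also makes it possible to close the file explicitly once the generator is no longer needed.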

Mike Müller answered Oct 15 '22 03:10