How can I nest an arbitrary number of Python file context managers?

I want to take an arbitrary number of paths that represent nested tar archives, and perform an operation on the innermost archive. The trouble is, the nesting can be arbitrary, so the number of context managers I need is also arbitrary.

Take, for example:

from tarfile import TarFile

ARCHIVE_PATH = "path/to/archive.tar"

INNER_PATHS = (
    "nested/within/archive/one.tar",
    "nested/within/archive/two.tar",
    # Arbitrary number of these
)

def list_inner_contents(archive_path, inner_paths):
    with TarFile(archive_path) as tf1:
        with TarFile(fileobj=tf1.extractfile(inner_paths[0])) as tf2:
            with TarFile(fileobj=tf2.extractfile(inner_paths[1])) as tf3:
                # ...arbitrary level of these!
                return tfX.getnames()

contents = list_inner_contents(ARCHIVE_PATH, INNER_PATHS)

I can't use the with statement's nesting syntax because there could be any number of levels to nest. I can't use contextlib.nested because the docs say right there:

...using nested() to open two files is a programming error as the first file will not be closed promptly if an exception is thrown when opening the second file.

Is there a way to use language constructs to do this, or do I need to manually manage my own stack of open file objects?

detly asked Apr 28 '13 04:04

1 Answer

You can use recursion here. It seems the most natural fit for this case (unless Python already has special support for it):

from tarfile import TarFile

ARCHIVE_PATH = "path/to/archive.tar"

INNER_PATHS = [
    "nested/within/archive/one.tar",
    "nested/within/archive/two.tar",
    # Arbitrary number of these
]

def list_inner_contents(archive_path, inner_paths):
    def rec(tf, rest_paths):
        if not rest_paths:
            return tf.getnames()

        with TarFile(fileobj=tf.extractfile(rest_paths[0])) as tf2:
            return rec(tf2, rest_paths[1:])

    with TarFile(archive_path) as tf:
        try:
            return rec(tf, inner_paths)
        except RuntimeError:
            # Raised when inner_paths is so long that the recursion limit
            # is exceeded (RecursionError, a RuntimeError subclass, on Python 3.5+)
            return None
Alexey answered Oct 19 '22 14:10