I want to take an arbitrary number of paths that represent nested tar archives, and perform an operation on the innermost archive. The trouble is, the nesting can be arbitrary, so the number of context managers I need is also arbitrary.
Take, for example:
from tarfile import TarFile

ARCHIVE_PATH = "path/to/archive.tar"
INNER_PATHS = (
    "nested/within/archive/one.tar",
    "nested/within/archive/two.tar",
    # Arbitrary number of these
)

def list_inner_contents(archive_path, inner_paths):
    with TarFile(archive_path) as tf1:
        with TarFile(fileobj=tf1.extractfile(inner_paths[0])) as tf2:
            with TarFile(fileobj=tf2.extractfile(inner_paths[1])) as tf3:
                # ...arbitrary level of these!
                return tfX.getnames()

contents = list_inner_contents(ARCHIVE_PATH, INNER_PATHS)
I can't use the with statement's nesting syntax because there could be any number of levels to nest. I can't use contextlib.nested because the docs say right there:
"...using nested() to open two files is a programming error as the first file will not be closed promptly if an exception is thrown when opening the second file."
Is there a way to use language constructs to do this, or do I need to manually manage my own stack of open file objects?
Python provides a simple way to manage resources: context managers. The expression that follows the with keyword should evaluate to an object that performs context management.
A context manager usually sets up some resource, e.g. opens a connection, and automatically handles the cleanup when we are done with it. Probably the most common use case is opening a file: with open(path) as f keeps the file open until execution leaves the with statement.
The __exit__() method releases the resources acquired for the block. It runs whenever the with block is left, whether normally or because of an exception.
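To make the protocol concrete, here is a minimal sketch of a hand-written context manager. The Resource class and the demo function are hypothetical names invented for illustration; they only record the order in which things happen, to show that __exit__() still runs when the body raises.

```python
class Resource:
    """Hypothetical resource that records its lifecycle events in a list."""

    def __init__(self, log):
        self.log = log

    def __enter__(self):
        # Runs when the with block is entered; the return value is bound by "as".
        self.log.append("enter")
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Runs when the with block is left, normally or via an exception.
        self.log.append("exit")
        return False  # a false value means the exception is not suppressed


def demo():
    log = []
    try:
        with Resource(log):
            log.append("body")
            raise ValueError("boom")
    except ValueError:
        log.append("caught")
    return log
```

Calling demo() shows the cleanup step happening before the exception reaches the surrounding except clause.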
For this case you can use recursion. It feels like the most natural fit (unless Python already has a dedicated construct for this):
from tarfile import TarFile

ARCHIVE_PATH = "path/to/archive.tar"
INNER_PATHS = [
    "nested/within/archive/one.tar",
    "nested/within/archive/two.tar",
    # Arbitrary number of these
]

def list_inner_contents(archive_path, inner_paths):
    def rec(tf, rest_paths):
        # Base case: no more nested archives, so tf is the innermost one
        if not rest_paths:
            return tf.getnames()
        # Open the next level and recurse while it is still open
        with TarFile(fileobj=tf.extractfile(rest_paths[0])) as tf2:
            return rec(tf2, rest_paths[1:])

    with TarFile(archive_path) as tf:
        try:
            return rec(tf, inner_paths)
        except RuntimeError:
            # We come here in case the inner_paths list is too long
            # and we go too deeply in the recursion
            return None
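If you are on Python 3.3 or later, the standard library does have a dedicated construct for this: contextlib.ExitStack, which manages a variable number of context managers and closes them in reverse order. The following is a minimal sketch rather than a drop-in replacement. It assumes every inner path names a regular file member (extractfile() returns None for other member types, which would fail here), and it uses tarfile.open instead of the TarFile constructor so that compressed archives are also detected.

```python
import tarfile
from contextlib import ExitStack


def list_inner_contents(archive_path, inner_paths):
    with ExitStack() as stack:
        # Outermost archive, opened from a path on disk.
        tf = stack.enter_context(tarfile.open(archive_path))
        for inner_path in inner_paths:
            # File-like object for the nested archive inside the current one.
            member = stack.enter_context(tf.extractfile(inner_path))
            # Open the nested archive on top of that file object.
            tf = stack.enter_context(tarfile.open(fileobj=member))
        # Everything registered on the stack is closed, innermost first,
        # when the with block exits, including on exceptions.
        return tf.getnames()
```

Because the loop replaces the recursion, the depth of nesting is no longer bounded by the recursion limit, and if opening any level fails, the levels already opened are still closed.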