Python tarfile and excludes

Tags: python, tarfile

This is an excerpt from Python's documentation:

If exclude is given it must be a function that takes one filename argument and returns a boolean value. Depending on this value the respective file is either excluded (True) or added (False).

I must admit that I have no idea what that means.

Furthermore:

Deprecated since version 2.7: The exclude parameter is deprecated, please use the filter parameter instead. For maximum portability, filter should be used as a keyword argument rather than as a positional argument so that code won’t be affected when exclude is ultimately removed.

Ok... and the definition for "filter":

If filter is specified it must be a function that takes a TarInfo object argument and returns the changed TarInfo object. If it instead returns None the TarInfo object will be excluded from the archive.

... back to square one :)

What I really need is a way to pass an array (or a ":"-delimited string) of excludes to tarfile.add.

I would not mind if you tried to explain what those passages from the Python docs meant.

P.S.:

This just crossed my mind:

  • Make a list of the source directory's contents
  • Pop the excludes
  • Call tar.add on the individual entries that are left

But I'd like it done in a more cultured way.
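
For reference, the manual approach listed above could be sketched roughly as follows. This assumes the excludes are names of top-level entries in the source directory; add_except and its parameters are hypothetical names made up for the sketch, and excluded names nested deeper in the tree would still be archived.

```python
import os
import tarfile

def add_except(tar, source_dir, excludes):
    """Add every top-level entry of source_dir to tar, skipping names in excludes.

    Hypothetical helper for illustration; it only filters the top level.
    """
    entries = sorted(os.listdir(source_dir))
    kept = [name for name in entries if name not in excludes]
    for name in kept:
        # arcname stores the entry relative to source_dir instead of its full path
        tar.add(os.path.join(source_dir, name), arcname=name)
    return kept
```

It works for flat directories, but tar.add recurses into subdirectories on its own, so anything excluded by name inside a subdirectory slips through.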

Kaurin asked Apr 14 '13 15:04

1 Answer

If exclude is given it must be a function that takes one filename argument and returns a boolean value. Depending on this value the respective file is either excluded (True) or added (False).

For example, if you wanted to exclude all filenames beginning with the letter 'a', you'd do something like...

def exclude_function(filename):
    if filename.startswith('a'):
        return True
    else:
        return False

mytarfile.add(..., exclude=exclude_function)

For your case, you'd want something like...

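import os

EXCLUDE_FILES = ['README', 'INSTALL', '.cvsignore']

def exclude_function(filename):
    # filename is the path being added (e.g. 'src/README'),
    # so compare only its last component
    if os.path.basename(filename) in EXCLUDE_FILES:
        return True
    else:
        return False

mytarfile.add(..., exclude=exclude_function)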

...which can be reduced to...

import os

EXCLUDE_FILES = ['README', 'INSTALL', '.cvsignore']

mytarfile.add(..., exclude=lambda x: os.path.basename(x) in EXCLUDE_FILES)

Update

TBH, on Python 2.7 I wouldn't worry too much about the deprecation warning, but the exclude parameter was removed in Python 3.7, so on current versions you have to use the filter parameter. For that you'd need something like...

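import os

EXCLUDE_FILES = ['README', 'INSTALL', '.cvsignore']

def filter_function(tarinfo):
    # tarinfo.name is the path stored in the archive (e.g. 'src/README'),
    # so compare only its last component
    if os.path.basename(tarinfo.name) in EXCLUDE_FILES:
        return None
    else:
        return tarinfo

mytarfile.add(..., filter=filter_function)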

...which can be reduced to...

import os

EXCLUDE_FILES = ['README', 'INSTALL', '.cvsignore']

mytarfile.add(..., filter=lambda x: None if os.path.basename(x.name) in EXCLUDE_FILES else x)
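
To tie this back to the question, here is a minimal sketch that turns a list or a ":"-delimited string of excludes into a filter function. make_exclude_filter, its sep parameter, and the choice to match fnmatch patterns against the base name are assumptions made for this sketch, not something tarfile provides; the temporary directory and the file names exist only for the demonstration.

```python
import fnmatch
import io
import os
import tarfile
import tempfile

def make_exclude_filter(excludes, sep=':'):
    """Build a tarfile filter from a list or a sep-delimited string of patterns.

    Hypothetical helper for this sketch, not part of tarfile.
    """
    if isinstance(excludes, str):
        excludes = excludes.split(sep)
    patterns = [p for p in excludes if p]  # ignore empty entries

    def _filter(tarinfo):
        # tarinfo.name is the path stored in the archive (e.g. 'src/README'),
        # so match the patterns against its last component only.
        base = os.path.basename(tarinfo.name)
        if any(fnmatch.fnmatch(base, p) for p in patterns):
            return None   # exclude this member
        return tarinfo    # keep this member unchanged

    return _filter

# Demonstration on a throwaway directory, archived into memory.
with tempfile.TemporaryDirectory() as tmp:
    for name in ('README', 'main.py', 'main.pyc'):
        with open(os.path.join(tmp, name), 'w') as fh:
            fh.write('x')
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode='w') as tar:
        tar.add(tmp, arcname='src', filter=make_exclude_filter('README:*.pyc'))

# List what actually ended up in the archive.
buf.seek(0)
with tarfile.open(fileobj=buf, mode='r') as tar:
    names = tar.getnames()
print(names)
```

Because matching is done on the base name, a pattern that matches a directory drops that directory and everything under it: tar.add stops recursing as soon as the filter returns None for the directory entry.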
Aya answered Sep 19 '22 11:09