Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Python zip a sub folder and not the entire folder path

I have a program to zip all the contents in a folder. I did not write this code but I found it somewhere online and I am using it. I intend to zip a folder for example say, C:/folder1/folder2/folder3/ . I want to zip folder3 and all its contents in a file say folder3.zip. With the below code, once i zip it, the contents of folder3.zip wil be folder1/folder2/folder3/and files. I do not want the entire path to be zipped and i only want the subfolder im interested to zip (folder3 in this case). I tried some os.chdir etc, but no luck.

def makeArchive(fileList, archive):
    """
    'fileList' is a list of file names - full path each name
    'archive' is the file name for the archive with a full path
    """
    try:
        a = zipfile.ZipFile(archive, 'w', zipfile.ZIP_DEFLATED)

        for f in fileList:
            print "archiving file %s" % (f)
            a.write(f)
        a.close()
        return True
    except: return False 

def dirEntries(dir_name, subdir, *args):
    # Creates a list of all files in the folder
    '''Return a list of file names found in directory 'dir_name'
    If 'subdir' is True, recursively access subdirectories under 'dir_name'.
    Additional arguments, if any, are file extensions to match filenames. Matched
        file names are added to the list.
    If there are no additional arguments, all files found in the directory are
        added to the list.
    Example usage: fileList = dirEntries(r'H:\TEMP', False, 'txt', 'py')
        Only files with 'txt' and 'py' extensions will be added to the list.
    Example usage: fileList = dirEntries(r'H:\TEMP', True)
        All files and all the files in subdirectories under H:\TEMP will be added
        to the list. '''

    fileList = []
    for file in os.listdir(dir_name):
        dirfile = os.path.join(dir_name, file)
        if os.path.isfile(dirfile):
            if not args:
                fileList.append(dirfile)
            else:
                if os.path.splitext(dirfile)[1][1:] in args:
                    fileList.append(dirfile)
            # recursively access file names in subdirectories
        elif os.path.isdir(dirfile) and subdir:
            print "Accessing directory:", dirfile
            fileList.extend(dirEntries(dirfile, subdir, *args))
    return fileList

You can call this by makeArchive(dirEntries(folder, True), zipname).

Any ideas as to how to solve this problem? I am uing windows OS annd python 25, i know in python 2.7 there is shutil make_archive which helps but since i am working on 2.5 i need another solution :-/

like image 681
Kiran6699 Avatar asked Jan 21 '13 12:01

Kiran6699


People also ask

Can you zip a folder with sub folders?

Creating a zip folder allows files to be organized and compressed to a smaller file size for distribution or saving space. Zip folder can have subfolders within this main folder.


2 Answers

You'll have to give an arcname argument to ZipFile.write() that uses a relative path. Do this by giving the root path to remove to makeArchive():

def makeArchive(fileList, archive, root):
    """
    'fileList' is a list of file names - full path each name
    'archive' is the file name for the archive with a full path
    """
    a = zipfile.ZipFile(archive, 'w', zipfile.ZIP_DEFLATED)

    for f in fileList:
        print "archiving file %s" % (f)
        a.write(f, os.path.relpath(f, root))
    a.close()

and call this with:

makeArchive(dirEntries(folder, True), zipname, folder)

I've removed the blanket try:, except:; there is no use for that here and only serves to hide problems you want to know about.

The os.path.relpath() function returns a path relative to root, effectively removing that root path from the archive entry.

On python 2.5, the relpath function is not available; for this specific usecase the following replacement would work:

def relpath(filename, root):
    return filename[len(root):].lstrip(os.path.sep).lstrip(os.path.altsep)

and use:

a.write(f, relpath(f, root))

Note that the above relpath() function only works for your specific case where filepath is guaranteed to start with root; on Windows the general case for relpath() is a lot more complex. You really want to upgrade to Python 2.6 or newer if at all possible.

like image 96
Martijn Pieters Avatar answered Sep 30 '22 23:09

Martijn Pieters


ZipFile.write has an optional argument arcname. Use this to remove parts of the path.

You could change your method to be:

def makeArchive(fileList, archive, path_prefix=None):
    """
    'fileList' is a list of file names - full path each name
    'archive' is the file name for the archive with a full path
    """
    try:
        a = zipfile.ZipFile(archive, 'w', zipfile.ZIP_DEFLATED)

        for f in fileList:
            print "archiving file %s" % (f)
            if path_prefix is None:
                a.write(f)
            else:
                a.write(f, f[len(path_prefix):] if f.startswith(path_prefix) else f)
        a.close()
        return True
    except: return False 

Martijn's approach using os.path is much more elegant, though.

like image 30
Thorsten Kranz Avatar answered Sep 30 '22 23:09

Thorsten Kranz