Python zipfile removes execute permissions from binaries

Not sure why this is happening, but when I run unzip on a file (e.g. apache-groovy-binary-2.4.7.zip) at the command line...

  • directories are rwxr-xr-x
  • files are rwxr-xr-x or rw-r--r--

But when I run zipfile.extractall() from a Python 2.7 script on the same file...

  • directories are rwxr-x---
  • files are all rw-r----- (even the ones that should be executable, as per above).

My umask setting is 0027, which partly explains what's going on, but why is the executable bit being removed from all files?

What's the easiest fix to get Python adopting similar behaviour to the command-line version (apart from shelling out, of course!)?

RCross asked Sep 02 '16 15:09


1 Answer

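The reason for this can be found in the _extract_member() method in zipfile.py: it only calls shutil.copyfileobj(), which writes the output file without any execute bits. The mode stored in the archive is never applied, so each file ends up with the default creation mode filtered by your umask.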
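To see that the permissions are present in the archive and simply not applied on extraction, you can inspect each entry's external_attr. The following is a minimal sketch, assuming Python 3.3+ (for stat.filemode) and an archive created on a Unix-like system; the helper names stored_modes and make_demo_zip are hypothetical and only there for illustration.

```python
import io
import stat
import zipfile


def make_demo_zip():
    """Build a small in-memory archive with one executable and one plain file."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        script = zipfile.ZipInfo("bin/run.sh")
        # The Unix mode lives in the high 16 bits of external_attr.
        script.external_attr = (stat.S_IFREG | 0o755) << 16
        zf.writestr(script, "#!/bin/sh\necho hi\n")

        readme = zipfile.ZipInfo("README")
        readme.external_attr = (stat.S_IFREG | 0o644) << 16
        zf.writestr(readme, "docs\n")
    return buf.getvalue()


def stored_modes(zip_bytes):
    """Return {member name: mode string} for the mode stored in each entry."""
    modes = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        for info in zf.infolist():
            mode = info.external_attr >> 16
            # A value of 0 means the entry carries no Unix mode at all.
            modes[info.filename] = stat.filemode(mode) if mode else None
    return modes


if __name__ == "__main__":
    for name, mode in stored_modes(make_demo_zip()).items():
        print(name, mode)
```

If the executables in your archive show an x bit here but not on disk after extractall(), the mode was stored correctly and only the extraction step dropped it.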

The easiest way to solve this is by subclassing ZipFile and changing extract() (or patching in an extended version). By default it is:

def extract(self, member, path=None, pwd=None):
    """Extract a member from the archive to the current working directory,
       using its full name. Its file information is extracted as accurately
       as possible. `member' may be a filename or a ZipInfo object. You can
       specify a different directory using `path'.
    """
    if not isinstance(member, ZipInfo):
        member = self.getinfo(member)

    if path is None:
        path = os.getcwd()

    return self._extract_member(member, path, pwd)

This last line should be changed to actually set the mode based on the original attributes. You can do it this way:

import os
import sys
from zipfile import ZipFile, ZipInfo

class MyZipFile(ZipFile):

    if sys.version_info < (3, 6):

        # Before 3.6, extractall() goes through extract().
        def extract(self, member, path=None, pwd=None):
            if not isinstance(member, ZipInfo):
                member = self.getinfo(member)
            if path is None:
                path = os.getcwd()
            ret_val = self._extract_member(member, path, pwd)
            # The high 16 bits of external_attr hold the Unix mode.
            attr = member.external_attr >> 16
            if attr != 0:  # 0 means no Unix mode was stored
                os.chmod(ret_val, attr)
            return ret_val

    else:

        # From 3.6 on, extractall() calls _extract_member() directly.
        def _extract_member(self, member, targetpath, pwd):
            if not isinstance(member, ZipInfo):
                member = self.getinfo(member)
            path = super(MyZipFile, self)._extract_member(member, targetpath, pwd)

            if member.external_attr > 0xffff:
                os.chmod(path, member.external_attr >> 16)
            return path


with MyZipFile('test.zip') as zfp:
    zfp.extractall()

(The above is based on Python 3.5 and assumes the zipfile is called test.zip)
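If you would rather not override a private method, a similar effect can be had with the public API alone: extract each member with ZipFile.extract() and then chmod the returned path. This is a minimal sketch, not a drop-in replacement; the function name extract_with_permissions is hypothetical, and it assumes a POSIX system and an archive whose entries carry Unix modes.

```python
import os
import zipfile


def extract_with_permissions(zip_path, dest_dir="."):
    """Extract every member, then re-apply the mode stored in the archive."""
    extracted = []
    with zipfile.ZipFile(zip_path) as zf:
        for info in zf.infolist():
            # extract() returns the path that was actually created on disk.
            target = zf.extract(info, dest_dir)
            # Keep only the rwx bits; setuid, setgid and sticky are dropped.
            mode = (info.external_attr >> 16) & 0o777
            if mode:  # 0 means no Unix mode was stored for this entry
                os.chmod(target, mode)
            extracted.append(target)
    return extracted
```

Note that os.chmod() is not filtered by the umask, so files get exactly the mode recorded in the archive. If you want the umask respected, mask the mode yourself before calling chmod.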

Anthon answered Oct 06 '22 06:10