Renaming the extracted file from zipfile

I have lots of zipped files on a Linux server, and each one contains multiple text files.

I want to extract some of those text files, which have the same names across the zipped files, and save them to a folder. At the moment I create one folder per zipped file and extract the text files into it. What I need instead is to append the parent zip file's name to the end of each file name and save all the text files in one directory. For example, if the zip file was March132017.zip and I extracted holding.txt, my filename would be holding_march132017.txt.

My problem is that I am not able to change the extracted file's name. I would appreciate any advice.

import os 
import sys 
import zipfile
os.chdir("/feeds/lipper/emaxx") 

pwkwd = "/feeds/lipper/emaxx" 

for item in os.listdir(pwkwd): # loop through items in dir
    if item.endswith(".zip"): # check for ".zip" extension
        file_name = os.path.abspath(item) # get full path of files
        fh = open(file_name, "rb")
        zip_ref = zipfile.ZipFile(fh)

        filelist = 'ISSUERS.TXT' , 'SECMAST.TXT' , 'FUND.TXT' , 'HOLDING.TXT'
        for name in filelist :
            try:
                outpath = "/SCRATCH/emaxx" + "/" + os.path.splitext(item)[0]
                zip_ref.extract(name, outpath)

            except KeyError:
                pass  # this archive does not contain the file

        fh.close()
Roo asked May 19 '17 22:05


3 Answers

import zipfile

zipdata = zipfile.ZipFile('somefile.zip')
zipinfos = zipdata.infolist()

# iterate through each file
for zipinfo in zipinfos:
    # This will do the renaming
    zipinfo.filename = do_something_to(zipinfo.filename)
    zipdata.extract(zipinfo)

Reference: https://bitdrop.st0w.com/2010/07/23/python-extracting-a-file-from-a-zip-file-with-a-different-name/
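
Applied to the question's naming scheme, the same trick might look like the sketch below. It assumes the wanted members sit at the top level of each archive; the function name extract_with_suffix and its parameters are made up for illustration and are not part of zipfile.

```python
import os
import zipfile


def extract_with_suffix(zip_path, wanted, target_dir):
    """Extract the members named in `wanted`, appending the archive's stem to each name."""
    stem = os.path.splitext(os.path.basename(zip_path))[0]  # "March132017"
    written = []
    with zipfile.ZipFile(zip_path) as zf:
        for info in zf.infolist():
            if info.filename not in wanted:
                continue
            base, ext = os.path.splitext(os.path.basename(info.filename))
            # Changing the in-memory name only changes where extract() writes;
            # the archive on disk is left untouched.
            info.filename = f"{base}_{stem}{ext}"
            written.append(zf.extract(info, target_dir))
    return written
```

Calling extract_with_suffix("/feeds/lipper/emaxx/March132017.zip", {"HOLDING.TXT"}, "/SCRATCH/emaxx") would be expected to write /SCRATCH/emaxx/HOLDING_March132017.TXT, keeping the member's original case and extension.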

Saikiran Gosikonda answered Nov 16 '22 01:11


Why not just read the file in question and save it yourself instead of extracting? Something like:

import os
import zipfile

source_dir = "/feeds/lipper/emaxx"  # folder with zip files
target_dir = "/SCRATCH/emaxx"  # folder to save the extracted files

# Are you sure your files names are capitalized in your zip files?
filelist = ['ISSUERS.TXT', 'SECMAST.TXT', 'FUND.TXT', 'HOLDING.TXT']

for item in os.listdir(source_dir):  # loop through items in dir
    if item.endswith(".zip"):  # check for ".zip" extension
        file_path = os.path.join(source_dir, item)  # get zip file path
        with zipfile.ZipFile(file_path) as zf:  # open the zip file
            for target_file in filelist:  # loop through the list of files to extract
                if target_file in zf.namelist():  # check if the file exists in the archive
                    # generate the desired output name:
                    # use the zip's bare name (item), not its full path, so no "/" ends up in the name
                    target_name = os.path.splitext(target_file)[0] + "_" + os.path.splitext(item)[0] + ".txt"
                    target_path = os.path.join(target_dir, target_name)  # output path
                    with open(target_path, "wb") as f:  # binary mode: zf.read() returns bytes
                        f.write(zf.read(target_file))  # save the contents of the file in it
                # next file from the list...
    # next zip file...
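
If the text files are large, reading each one fully into memory may be undesirable. A possible variant, sketched here under the same folder layout, streams each member with ZipFile.open and shutil.copyfileobj instead; the helper names copy_member and collect are invented for this example.

```python
import os
import shutil
import zipfile


def copy_member(zf, member, target_path):
    """Stream one archive member to target_path in chunks."""
    with zf.open(member) as src, open(target_path, "wb") as dst:
        shutil.copyfileobj(src, dst)


def collect(source_dir, target_dir, wanted):
    """Copy each wanted member of every zip in source_dir to
    target_dir as <member>_<zipname>.txt."""
    os.makedirs(target_dir, exist_ok=True)
    for item in sorted(os.listdir(source_dir)):
        if not item.endswith(".zip"):
            continue
        zip_stem = os.path.splitext(item)[0]  # zip name without ".zip"
        with zipfile.ZipFile(os.path.join(source_dir, item)) as zf:
            names = set(zf.namelist())
            for member in wanted:
                if member in names:  # skip archives that lack this file
                    out_name = os.path.splitext(member)[0] + "_" + zip_stem + ".txt"
                    copy_member(zf, member, os.path.join(target_dir, out_name))
```

With the paths from the question, this would be called as collect("/feeds/lipper/emaxx", "/SCRATCH/emaxx", ["ISSUERS.TXT", "SECMAST.TXT", "FUND.TXT", "HOLDING.TXT"]).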
zwer answered Nov 16 '22 00:11


You could simply run a rename after each file is extracted, right? os.rename should do the trick.

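extracted_path = zip_ref.extract(name, outpath)  # extract() returns the path it wrote
parent_zip = os.path.basename(outpath)  # zip file name without ".zip"
new_file_name, ext = os.path.splitext(os.path.basename(name))  # "HOLDING", ".TXT"

# e.g. /SCRATCH/emaxx/HOLDING_March132017.TXT, next to the per-zip folders
new_name_path = os.path.join(os.path.dirname(outpath), new_file_name + "_" + parent_zip + ext)
os.rename(extracted_path, new_name_path)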
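
A self-contained version of the extract-then-rename idea might look like the sketch below. It relies on ZipFile.extract returning the path it wrote, and it swaps os.rename for os.replace so that a leftover file from an earlier run is overwritten on every platform. The function name extract_then_rename and its parameters are made up for illustration.

```python
import os
import zipfile


def extract_then_rename(zip_path, member, target_dir):
    """Extract `member` into target_dir, then rename it to
    <member>_<zipname><ext>. Returns the new path, or None if the
    archive does not contain `member`."""
    zip_stem = os.path.splitext(os.path.basename(zip_path))[0]
    base, ext = os.path.splitext(os.path.basename(member))
    with zipfile.ZipFile(zip_path) as zf:
        try:
            extracted = zf.extract(member, target_dir)  # path of the file written
        except KeyError:
            return None  # member is missing from this archive
    new_path = os.path.join(target_dir, f"{base}_{zip_stem}{ext}")
    os.replace(extracted, new_path)  # like os.rename, but overwrites an existing file
    return new_path
```

Because every archive extracts to the same temporary name before the rename, this approach suits a sequential loop; two processes working on the same target directory at once could overwrite each other's files.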

For the filename, if you want it to be incremental, simply start a count and, for each file, go up by one.

count = 0
for file in files:
    count += 1
    # ... Do our file actions
    new_file_name = original_file_name + "_" + str(count)
    # ...

Or if you don't care about the end name you could always use something like a uuid.

import uuid
random_name = str(uuid.uuid4())  # uuid4() returns a UUID object; convert it to use as a name
mccatnm answered Nov 16 '22 00:11