
Why does zipfile.is_zipfile return True on an xlsx file?

I am using is_zipfile to check whether a file is a zip file before extracting it. But the method returns True for an Excel file read from a StringIO object. I am using Python 2.7. Does anyone know how to fix this? Is is_zipfile reliable? Thanks.

Noel Pure asked Mar 07 '14 04:03




1 Answer

Quoting from Microsoft's XLSX structure overview:

Workbook data is contained in a ZIP package conforming to the Open Packaging Conventions

So .xlsx files really are zip files, and is_zipfile is right to return True for them. If you do not want to treat them as zip files, you have to exclude them explicitly, for example with a condition like this:

if os.path.splitext(filename)[1] != ".xlsx" and zipfile.is_zipfile(filename):
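
The extension check needs a filename, which an in-memory object such as StringIO does not have. As a rough sketch of an alternative, you could look inside the archive instead: packages that follow the Open Packaging Conventions carry a `[Content_Types].xml` entry at the archive root. The helper name `is_plain_zip` is hypothetical, not part of the zipfile module, and the check is only a heuristic, since an ordinary zip could contain an entry with that name too.

```python
import io
import zipfile


def is_plain_zip(fileobj):
    """Return True for a zip archive that does not look like an OPC package.

    Hypothetical helper for illustration; expects a seekable binary
    file-like object (for example io.BytesIO, or StringIO on Python 2.7).
    """
    fileobj.seek(0)
    if not zipfile.is_zipfile(fileobj):
        return False  # not a zip archive at all

    fileobj.seek(0)
    with zipfile.ZipFile(fileobj) as archive:
        names = archive.namelist()
    fileobj.seek(0)  # leave the object rewound for the caller

    # OPC packages (.xlsx, .docx, .pptx) carry this entry at the root.
    return "[Content_Types].xml" not in names
```

You would then extract only when `is_plain_zip(buffer)` is true. This also excludes .docx and .pptx files, which may or may not be what you want.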
thefourtheye answered Oct 13 '22 05:10