
Reading csv zipped files in python

I'm trying to get data from a zipped CSV file. Is there a way to do this without unzipping the whole file? If not, how can I unzip the files and read them efficiently?

Elyza Agosta asked Nov 15 '14 04:11



People also ask

Can you read a zipped file Python?

Yes, you can. To read a zipped or tar.gz file into a pandas DataFrame, the read_csv method supports on-the-fly decompression of on-disk data through its compression argument.

Can pandas read zipped CSV?

Reading one file from a zip archive that contains multiple files: pandas cannot directly read data from a zip archive if it holds more than one file; to solve this, use the zipfile module from Python's standard library. The zipfile module offers two routes for reading zip data: the ZipFile and Path classes.
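The multi-file case described above could be sketched as follows. This is a minimal illustration, not a definitive implementation: the archive name "multi.zip", its member names, and the helper read_csv_members are all hypothetical, and the archive is created in a temporary directory only so the example runs on its own.

```python
import os
import tempfile
import zipfile

import pandas as pd

# Hypothetical archive with two CSV members and one non-CSV member,
# created here only so the sketch is runnable.
tmp_dir = tempfile.mkdtemp()
zip_path = os.path.join(tmp_dir, "multi.zip")
with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("a.csv", "id,value\n1,10\n")
    zf.writestr("b.csv", "id,value\n2,20\n3,30\n")
    zf.writestr("notes.txt", "not a csv")


def read_csv_members(path):
    """Return {member name: DataFrame} for every .csv member of the archive."""
    frames = {}
    with zipfile.ZipFile(path) as zf:
        for name in zf.namelist():
            if name.lower().endswith(".csv"):
                # zf.open() gives a binary file-like object that
                # pd.read_csv can parse without extracting to disk.
                with zf.open(name) as member:
                    frames[name] = pd.read_csv(member)
    return frames


frames = read_csv_members(zip_path)
print(sorted(frames))
```

Each member is decompressed only when it is opened, so unrelated files in the archive are never read.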

How do I read a zip file using pandas?

Method #1: Pass compression='zip' to pandas.read_csv(). With the compression argument set to 'zip', pandas first decompresses the archive and then creates the DataFrame from the CSV file inside it.
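A short sketch of that approach, assuming the archive holds exactly one CSV file. The names "example.zip" and "data.csv" are made up for illustration, and the archive is written to a temporary directory so the snippet is self-contained.

```python
import os
import tempfile
import zipfile

import pandas as pd

# Build a small single-member archive (hypothetical names) to read back.
tmp_dir = tempfile.mkdtemp()
zip_path = os.path.join(tmp_dir, "example.zip")
with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("data.csv", "id,value\n1,10\n2,20\n")

# compression="zip" asks pandas to decompress the archive's single
# CSV member on the fly; no extracted copy is written to disk.
df = pd.read_csv(zip_path, compression="zip")
print(df)
```

If the archive contains more than one file, this call is expected to fail, and the zipfile module is the usual fallback.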


1 Answer

I used the zipfile module to read the CSV from the ZIP directly into a pandas DataFrame. Let's say the file is named "intfile.csv" and it's in a .zip named "THEZIPFILE.zip":

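    import pandas as pd
    import zipfile

    # Open the archive; nothing is extracted to disk
    zf = zipfile.ZipFile('C:/Users/Desktop/THEZIPFILE.zip')

    # zf.open() returns a file-like object that read_csv can parse
    df = pd.read_csv(zf.open('intfile.csv'))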
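If pandas is not required, the same idea can probably be applied with the standard library alone. The sketch below streams rows one at a time, which may help with large files. It assumes the CSV is UTF-8 encoded; the helper iter_rows is hypothetical, and the archive is rebuilt in a temporary directory (reusing the file names from above) so the example runs anywhere.

```python
import csv
import io
import os
import tempfile
import zipfile

# Recreate a tiny stand-in for THEZIPFILE.zip in a temp directory.
tmp_dir = tempfile.mkdtemp()
zip_path = os.path.join(tmp_dir, "THEZIPFILE.zip")
with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("intfile.csv", "id,value\n1,10\n2,20\n")


def iter_rows(path, member):
    """Yield each CSV row of one archive member as a dict, lazily."""
    with zipfile.ZipFile(path) as zf:
        with zf.open(member) as raw:
            # zf.open() returns bytes; wrap it so csv sees text.
            # UTF-8 is an assumption about the file's encoding.
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            yield from csv.DictReader(text)


rows = list(iter_rows(zip_path, "intfile.csv"))
print(rows)
```

The with blocks close both the member and the archive once iteration finishes, which the one-line zf.open() call passed to read_csv does not do on its own.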
Yaron answered Oct 04 '22 23:10
