I'm trying to use read_csv in pandas to read a zipped file from an FTP server. The zip file contains just one file, as is required.
Here's my code:
pd.read_csv('ftp://ftp.fec.gov/FEC/2016/cn16.zip', compression='zip')
I get this error:
AttributeError: addinfourl instance has no attribute 'seek'
I get this error in both pandas 0.18.1 and 0.19.0. Am I missing something, or could this be a bug?
pandas can now load data straight from zip or other compressed files into a DataFrame. The read_csv documentation describes the compression parameter as follows:
compression : {‘infer’, ‘gzip’, ‘bz2’, ‘zip’, ‘xz’, None}, default ‘infer’
For on-the-fly decompression of on-disk data. If ‘infer’ and filepath_or_buffer is path-like, then detect compression from the following extensions: ‘.gz’, ‘.bz2’, ‘.zip’, or ‘.xz’ (otherwise no decompression). If using ‘zip’, the ZIP file must contain only one data file to be read in. Set to None for no decompression.
New in version 0.18.1: support for ‘zip’ and ‘xz’ compression.
import pandas as pd
df = pd.read_csv("path_to_file.zip")
# or
df = pd.read_csv("path_to_file.zip", compression="zip")
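As a minimal sketch of the single-file rule quoted above, the snippet below builds a small ZIP archive in memory and reads it back. The member name `data.csv` and its contents are invented for illustration. Because the input is a buffer rather than a path, `compression="zip"` is passed explicitly instead of relying on 'infer'.

```python
import io
import zipfile

import pandas as pd

# Build a ZIP archive in memory holding exactly one CSV member.
# The member name and contents are invented for this example.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, mode="w") as archive:
    archive.writestr("data.csv", "a,b\n1,2\n3,4\n")
buffer.seek(0)  # rewind so the archive is read from its start

# 'infer' only inspects path extensions, so a buffer needs compression="zip".
df = pd.read_csv(buffer, compression="zip")
print(df)
```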
Although I'm not completely sure why you get the error, you can get around it by opening the URL with urllib2 and writing the data to an in-memory binary stream, as shown below. In addition, we have to specify the correct separator, or else we would receive another error.
import io
import urllib2 as urllib
import pandas as pd
r = urllib.urlopen('ftp://ftp.fec.gov/FEC/2016/cn16.zip')
df = pd.read_csv(io.BytesIO(r.read()), compression='zip', sep='|', header=None)
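The snippet above targets Python 2, where urllib2 exists. On Python 3 the same idea can be sketched with `urllib.request`. The helper name `read_zipped_csv` is hypothetical and not part of pandas, and I have not checked whether the FTP URL from the question is still reachable.

```python
import io
import urllib.request

import pandas as pd


def read_zipped_csv(url, **read_csv_kwargs):
    """Download a single-file ZIP archive and parse it with pandas.

    Hypothetical helper for illustration; not part of pandas.
    """
    # Read the whole response into memory so pandas receives a seekable buffer.
    with urllib.request.urlopen(url) as response:
        payload = io.BytesIO(response.read())
    # Extra keyword arguments (sep, header, ...) are forwarded to read_csv.
    return pd.read_csv(payload, compression="zip", **read_csv_kwargs)
```

Calling `read_zipped_csv(url, sep='|', header=None)` with the URL from the question would mirror the call above.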
As for the error itself, I think pandas is trying to call seek on the object returned for the URL before its contents have been downloaded (so it's not really a zip file yet), which would result in that error.
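One way to check this explanation without touching the network is to hand `zipfile` a stream that can be read but not rewound. The class `ForwardOnlyStream` and the helper `can_open_as_zip` below are hypothetical stand-ins written for this sketch; they only approximate how a raw URL response behaves.

```python
import io
import zipfile


class ForwardOnlyStream:
    """Hypothetical stand-in for a network response: read() but no seek()."""

    def __init__(self, data):
        self._inner = io.BytesIO(data)

    def read(self, size=-1):
        return self._inner.read(size)


def can_open_as_zip(stream):
    """Return True if zipfile can open the stream, False otherwise."""
    try:
        with zipfile.ZipFile(stream) as archive:
            archive.namelist()
        return True
    except (AttributeError, OSError, zipfile.BadZipFile):
        # zipfile needs seek() to locate the directory stored at the end of the file.
        return False


# Build a tiny archive in memory to test with.
raw = io.BytesIO()
with zipfile.ZipFile(raw, mode="w") as archive:
    archive.writestr("data.csv", "a,b\n1,2\n")
zip_bytes = raw.getvalue()

forward_only_ok = can_open_as_zip(ForwardOnlyStream(zip_bytes))  # no seek()
buffered_ok = can_open_as_zip(io.BytesIO(zip_bytes))             # seekable copy
print(forward_only_ok, buffered_ok)
```

The buffered copy opens while the forward-only stream does not, which is consistent with the BytesIO workaround above.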
import io
import zipfile
import pandas as pd
import requests
# url is the HTTP(S) address of the zip archive
header = {'user-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:32.0) Gecko/20100101 Firefox/54.0.1'}
remotezip = requests.get(url, headers=header)
root = zipfile.ZipFile(io.BytesIO(remotezip.content))
for name in root.namelist():
    df = pd.read_csv(root.open(name))
Taken from my own blog post: Read zipped csv files in python pandas without downloading zipfile
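Note that the loop above rebinds df on every iteration, so with a multi-file archive only the last member survives. A minimal sketch that keeps one DataFrame per member follows; the helper name `read_all_members` is hypothetical, and it assumes every non-directory member is a CSV file.

```python
import io
import zipfile

import pandas as pd


def read_all_members(zip_bytes, **read_csv_kwargs):
    """Parse every file in a ZIP archive into its own DataFrame.

    Hypothetical helper; assumes each non-directory member is a CSV file.
    """
    frames = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            if name.endswith("/"):
                continue  # skip directory entries
            with archive.open(name) as member:
                frames[name] = pd.read_csv(member, **read_csv_kwargs)
    return frames
```

With the requests-based code above, this would be called as `read_all_members(remotezip.content)`.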