
Downloading a csv.gz file from url in Python

I'm having trouble downloading a csv.gz file from a URL, although I have no problem downloading a tar.gz file. When I already have the csv.gz file locally, I can extract it and read my CSV file; it would just be handy if I could use a URL instead of needing csv-1.0.csv.gz beforehand.

This works:

import urllib.request
urllib.request.urlretrieve('http://www.mywebsite.com/csv-1-0.tar.gz','csv-1-0.tar.gz')

This does not work:

import urllib.request
urllib.request.urlretrieve('http://www.mywebsite.com/csv-1-0.csv.gz','csv-1-0.csv.gz')

I get this error: UnicodeEncodeError: 'ascii' codec can't encode character '\xad' in position 9: ordinal not in range(128)

Evan Ryan asked Dec 24 '22 11:12



1 Answer

As suggested at the very beginning of the docs for urllib.request, the excellent requests module is recommended for a higher-level HTTP client interface. The code is quite straightforward:

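import requests

url = "http://www.mywebsite.com/csv-1-0.csv.gz"
filename = url.split("/")[-1]  # last path segment: "csv-1-0.csv.gz"
with open(filename, "wb") as f:  # binary mode, since gzip data is not text
    r = requests.get(url)
    r.raise_for_status()  # raise on an HTTP error instead of saving an error page
    f.write(r.content)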

Basically, after assigning the URL and the destination file name, you open the destination file for writing in binary mode, request the file, then write the content of the request to the file. Done and done.
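If the goal is to read the CSV rather than keep the archive on disk, the downloaded bytes can probably be decompressed in memory instead. This is only a minimal sketch: it assumes the file is UTF-8, comma-separated text, `rows_from_gzipped_csv` is a hypothetical helper name, and the made-up `sample` bytes stand in for `r.content` so the snippet runs without a network connection.

```python
import csv
import gzip
import io


def rows_from_gzipped_csv(data: bytes, encoding: str = "utf-8") -> list[list[str]]:
    """Decompress gzip bytes (e.g. r.content) and parse them as CSV rows."""
    text = gzip.decompress(data).decode(encoding)
    # newline="" leaves line-ending handling to the csv module
    return list(csv.reader(io.StringIO(text, newline="")))


# Stand-in for r.content: a small CSV compressed in memory (made-up data).
sample = gzip.compress(b"id,name\n1,alpha\n2,beta\n")
rows = rows_from_gzipped_csv(sample)
print(rows)
```

With a real download, you would pass `r.content` in place of `sample`.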

MattDMo answered Feb 05 '23 17:02
