Readtimeout error for API data with Sodapy client

Tags: python, timeout

I'm trying to make API calls on the consumer complaint dataset, available online (https://data.consumerfinance.gov/dataset/Consumer-Complaints/s6ew-h6mp), with the SodaPy library (https://github.com/xmunoz/sodapy). I just want to get the CSV data; the webpage says it has 906182 rows.

I've followed the example on GitHub as best as I can, but it's just not working. Here's the code:

from sodapy import Socrata

client = Socrata("data.consumerfinance.gov", "apptoken", username="myusername", password="mypassword")

results = client.get("s6ew-h6mp")

I want to get the entire dataset, but I keep getting the following error:

ReadTimeout: HTTPSConnectionPool(host='data.consumerfinance.gov', port=443): Read timed out. (read timeout=10)

Any clues on how to work through this?

disname asked Jan 30 '23 03:01



2 Answers

By default, the Socrata client times out after 10 seconds.

You can raise this limit by setting the client's 'timeout' attribute:

from sodapy import Socrata

client = Socrata("data.consumerfinance.gov", "apptoken", username="myusername", password="mypassword")

# change the timeout variable to an arbitrarily large number of seconds
client.timeout = 50

results = client.get("s6ew-h6mp")
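
If a single fixed value is hard to guess, one option is to retry with progressively larger timeouts. The following is only a sketch: get_with_growing_timeout and its parameters are hypothetical names, not part of sodapy, and it assumes the client exposes a writable timeout attribute as in the snippet above.

```python
def get_with_growing_timeout(client, dataset_id, retry_on,
                             timeouts=(10, 30, 60), **params):
    """Call client.get, raising client.timeout after each timeout error.

    Hypothetical helper, not part of sodapy. `retry_on` is the exception
    class (or tuple of classes) that should trigger a retry.
    """
    if not timeouts:
        raise ValueError("timeouts must contain at least one value")

    last_error = None
    for seconds in timeouts:
        client.timeout = seconds  # same attribute as in the snippet above
        try:
            return client.get(dataset_id, **params)
        except retry_on as error:
            # remember the failure and try the next, larger timeout
            last_error = error

    # every timeout value failed: surface the last error to the caller
    raise last_error
```

The error in the question looks like a requests.exceptions.ReadTimeout, so with sodapy that class would be the natural value to pass as retry_on.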
sawyer answered Jan 31 '23 18:01



It's possible that the connection is timing out because the dataset is too large to return in a single request. You can try to download a subset of the data with the limit option, for example:

results = client.get("s6ew-h6mp", limit=1000)
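
To get the whole dataset without one huge request, the limit option can be combined with an offset to page through the rows. A minimal sketch, assuming client.get accepts limit, offset, and order keyword arguments and passes them through as SoQL parameters; fetch_all_rows is a hypothetical helper, and ordering by ":id" is an assumption made here to keep the page order stable between requests.

```python
def fetch_all_rows(client, dataset_id, page_size=1000):
    """Collect every row of a dataset by requesting it page by page.

    Hypothetical helper, not part of sodapy.
    """
    rows = []
    offset = 0
    while True:
        # ask for the next page; order=":id" is assumed to give stable paging
        page = client.get(dataset_id, limit=page_size,
                          offset=offset, order=":id")
        rows.extend(page)
        # a short (or empty) page means there is nothing left to fetch
        if len(page) < page_size:
            break
        offset += len(page)
    return rows
```

It would be called as fetch_all_rows(client, "s6ew-h6mp", page_size=5000), with the page size tuned so that each request finishes within the timeout.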

You can also query subsets of the data using SoQL keywords.
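
As an illustration of the SoQL route, a filter can be passed through the where keyword of client.get. This is a sketch only: soql_equals and get_matching_rows are hypothetical helpers, the column name and value in the usage note are placeholders rather than confirmed fields of this dataset, and escaping a single quote by doubling it is assumed here from SQL convention.

```python
def soql_equals(column, value):
    """Build a simple `column = 'value'` filter for a SoQL where clause."""
    # assumption: a single quote inside a string literal is escaped by doubling
    escaped = str(value).replace("'", "''")
    return f"{column} = '{escaped}'"


def get_matching_rows(client, dataset_id, column, value, limit=1000):
    """Fetch up to `limit` rows whose `column` equals `value`.

    Hypothetical helper, not part of sodapy.
    """
    return client.get(dataset_id, where=soql_equals(column, value), limit=limit)
```

For example, get_matching_rows(client, "s6ew-h6mp", "state", "CA") would request only rows matching that filter, provided the dataset has a column with that name.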

Otherwise, sodapy is built on the requests module, so the requests documentation may also be useful.

Adam Scherling answered Jan 31 '23 20:01
