How to time out gracefully while downloading with Python

Tags:

python

exception-handling

download

I'm downloading a huge set of files with the following code in a loop:

try:
    urllib.urlretrieve(url2download, destination_on_local_filesystem)
except KeyboardInterrupt:
    break
except:
    print "Timed-out or got some other exception: "+url2download

If the server times out on the URL url2download while the connection is still being established, the last exception handler catches it properly. But sometimes the server responds and the download starts, yet the server is so slow that even one file takes hours, and eventually it returns something like:

Enter username for Clients Only at albrightandomalley.com:
Enter password for  in Clients Only at albrightandomalley.com:

and just hangs there (although no username or password is asked for if the same link is downloaded through a browser).

My intention in this situation is to skip the file and go on to the next one. How can I do that? Is there a way in Python to specify how long it is acceptable to spend downloading one file and, once that time has been exceeded, to interrupt the download and move on?

215

asked Mar 01 '09 23:03

user63503

4 Answers

Try:

import socket

socket.setdefaulttimeout(30)
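To show where that call fits in the download loop, here is a minimal sketch in Python 3, where `urllib.urlretrieve` lives at `urllib.request.urlretrieve`. The names `download_all`, `jobs` and `retrieve` are hypothetical and introduced only for illustration. Note that the socket timeout applies to each blocking socket operation (connect, a single read), not to the download as a whole, so a server that keeps trickling data can still take a long time.

```python
import socket
import urllib.request  # Python 3 home of urlretrieve (urllib.urlretrieve in Python 2)


def download_all(jobs, timeout=30, retrieve=urllib.request.urlretrieve):
    """Download (url, destination) pairs, skipping any that fail or stall.

    Hypothetical helper: returns the list of URLs that were skipped.
    """
    # Process-wide: applies to every socket created after this call.
    socket.setdefaulttimeout(timeout)
    skipped = []
    for url, destination in jobs:
        try:
            retrieve(url, destination)
        except Exception as exc:
            # Unlike a bare "except:", this lets KeyboardInterrupt propagate,
            # so Ctrl-C still stops the whole loop.
            print("Timed out or got some other exception: %s (%s)" % (url, exc))
            skipped.append(url)
    return skipped
```

The `retrieve` parameter exists only so the loop can be exercised without a network; in normal use it can be left at its default.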

124

answered Nov 10 '22 01:11

spiralmoon

If you're not limited to what ships with Python out of the box, the urlgrabber module might come in handy:

import urlgrabber
urlgrabber.urlgrab(url2download, destination_on_local_filesystem,
                   timeout=30.0)

answered Nov 10 '22 00:11

Сыч

There's a discussion of this here. Caveats (in addition to the ones they mention): I haven't tried it, and they're using urllib2, not urllib (would that be a problem for you?). Actually, now that I think about it, this technique would probably work for urllib, too.
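Independently of the linked discussion (this is not necessarily the technique it describes), the urllib2-style interface makes one approach to an overall time budget fairly easy, because the response is a file-like object that can be read in chunks. The sketch below uses Python 3's `urllib.request.urlopen`, which accepts a `timeout` argument; `DownloadTooSlow`, `copy_with_deadline` and `download_with_deadline` are hypothetical names, and the default limits are arbitrary assumptions.

```python
import time
import urllib.request  # urllib2's functionality lives here in Python 3


class DownloadTooSlow(Exception):
    """Raised when a transfer exceeds its overall time budget."""


def copy_with_deadline(source, sink, max_seconds, chunk_size=64 * 1024,
                       clock=time.monotonic):
    """Copy source to sink in chunks; give up once max_seconds have elapsed.

    Returns the number of bytes copied.
    """
    deadline = clock() + max_seconds
    copied = 0
    while True:
        # Checked between chunks, so a single read can still block for up to
        # the socket timeout before the budget is noticed.
        if clock() > deadline:
            raise DownloadTooSlow("gave up after %d bytes" % copied)
        chunk = source.read(chunk_size)
        if not chunk:
            return copied
        sink.write(chunk)
        copied += len(chunk)


def download_with_deadline(url, destination, max_seconds=300,
                           socket_timeout=30):
    """Fetch url into destination within max_seconds overall (hypothetical)."""
    with urllib.request.urlopen(url, timeout=socket_timeout) as response:
        with open(destination, "wb") as out_file:
            return copy_with_deadline(response, out_file, max_seconds)
```

In the download loop, catching `DownloadTooSlow` alongside the other exceptions would skip the slow file. A partially written destination file is left behind in that case, so the caller may want to delete it.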

answered Nov 10 '22 02:11

Jacob Gabrielson

This is really a more general question about timing out a function: How to limit execution time of a function call in Python

I've used the method described in my answer there to write a wait-for-text function that times out, as part of attempting an auto-login. If you'd like similar functionality, you can refer to the code here:

http://code.google.com/p/psftplib/source/browse/trunk/psftplib.py
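For reference, one common way to limit the execution time of an arbitrary block is a signal-based alarm. This is a sketch of that general idea and not necessarily the method used in the linked answer or in psftplib. It assumes a Unix-like system and only works in the main thread; `time_limit` and `TimeLimitExceeded` are hypothetical names.

```python
import signal
from contextlib import contextmanager


class TimeLimitExceeded(Exception):
    """Raised inside the guarded block when its time budget runs out."""


@contextmanager
def time_limit(seconds):
    """Interrupt the enclosed block after `seconds` (Unix, main thread only)."""
    def _on_alarm(signum, frame):
        raise TimeLimitExceeded("operation exceeded %s seconds" % seconds)

    previous_handler = signal.signal(signal.SIGALRM, _on_alarm)
    signal.setitimer(signal.ITIMER_REAL, seconds)  # start the countdown
    try:
        yield
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)    # cancel any pending alarm
        signal.signal(signal.SIGALRM, previous_handler)
```

Wrapping the download call in `with time_limit(300):` and catching `TimeLimitExceeded` in the loop would skip any file that takes longer than five minutes in total.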

answered Nov 10 '22 01:11

monkut
