
Fastest way to download thousand files using python? [closed]

I need to download a thousand CSV files, each between 20 KB and 350 KB. Here is my code so far:

I'm using urllib.request.urlretrieve in a plain loop, roughly like the sketch below (the URL list is a placeholder for my real one). Downloading all thousand files, about 250 MB in total, takes over an hour.
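```python
import urllib.request

# Placeholder URL list -- the real list comes from elsewhere in my code.
urls = ["http://example.com/data/file{}.csv".format(i) for i in range(1000)]

for i, url in enumerate(urls):
    # One connection is opened and torn down per file.
    urllib.request.urlretrieve(url, "file{}.csv".format(i))
```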

So my question is:

How can I download a thousand CSV files in less than an hour?

Thank you!

Michael asked Dec 07 '13


2 Answers

Most likely the reason it takes so long is the per-file overhead: opening a connection, making the request, getting the file, and closing the connection again.

A thousand files in an hour works out to 3.6 seconds per file, which is high, though the site you are downloading from may simply be slow.

The first thing to do is to use HTTP/1.1 keep-alive, so one connection stays open for all the files. The easiest way to do that is to use the Requests library with a session.
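A minimal sketch of that approach, assuming the URLs are already collected in a list (the example URLs and filenames are placeholders):

```python
import requests

# Placeholder URL list; substitute the real thousand CSV URLs.
urls = ["http://example.com/data/file{}.csv".format(i) for i in range(1000)]

# A Session reuses the underlying TCP connection (keep-alive)
# instead of reconnecting for every file.
session = requests.Session()
for i, url in enumerate(urls):
    response = session.get(url)
    response.raise_for_status()  # fail loudly on HTTP errors
    with open("file{}.csv".format(i), "wb") as f:
        f.write(response.content)
```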

If this isn't fast enough, then you need to run several downloads in parallel, with either multiprocessing or threads.
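For example, a sketch of the threaded variant using concurrent.futures; the worker count is a guess to tune to what the server tolerates, and the URL list is again a placeholder:

```python
from concurrent.futures import ThreadPoolExecutor
import requests

urls = ["http://example.com/data/file{}.csv".format(i) for i in range(1000)]

def fetch(item):
    # Download one file; item is an (index, url) pair.
    i, url = item
    response = requests.get(url)
    response.raise_for_status()
    with open("file{}.csv".format(i), "wb") as f:
        f.write(response.content)

# 10 workers is an assumption; raise or lower it based on the server.
with ThreadPoolExecutor(max_workers=10) as pool:
    list(pool.map(fetch, enumerate(urls)))
```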

Lennart Regebro answered Nov 04 '22


You should try using multithreading to download many files in parallel. Have a look at the multiprocessing module, and especially its worker pools.
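A minimal sketch of the worker-pool idea, using ThreadPool from multiprocessing.pool (threads rather than processes, since downloads are I/O-bound); the URL list and pool size are placeholders:

```python
import urllib.request
from multiprocessing.pool import ThreadPool

urls = ["http://example.com/data/file{}.csv".format(i) for i in range(1000)]

def download(item):
    # Each worker downloads one file; item is an (index, url) pair.
    i, url = item
    urllib.request.urlretrieve(url, "file{}.csv".format(i))

pool = ThreadPool(processes=10)  # a pool of 10 worker threads
pool.map(download, list(enumerate(urls)))
pool.close()
pool.join()
```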

Juri Robl answered Nov 04 '22