Ridiculously low download speed with Python requests module

Problem:

I have been trying to make a simple anime downloader using Python's requests module, tracking progress with the progressbar2 module. While downloading, I get speeds of 0.x B/s. I assumed the problem was the choice of chunk_size, based on this question, but I get the same negligible speeds irrespective of chunk size.

Specs and info:

  1. I am using Windows 10, Python 3.5, and the latest requests module (2.18.4), on a decent internet connection of 40 Mbps.
  2. I can download the file from the link through a browser (Chrome) and Free Download Manager in about 1 minute.
  3. The link works perfectly and I have no firewall conflicts.

Code:

import os
import requests
from progressbar import ProgressBar, Percentage, Bar, ETA, FileTransferSpeed

os.chdir('D:\\anime\\ongoing')

widgets = ['Downloading: ', Percentage(), ' ', Bar(marker='#', left='[', right=']'),
           ' ', ETA(), ' ', FileTransferSpeed()]

url = 'https://lh3.googleusercontent.com/AtkUe87GbrINzTJS_Fj4W08CGqlOg9anwEF7n5-eKXcyS1RsaB8LdzRVaXloiJwiaX2IX1xqUiA=m22?title=(720P%20-%20mp4)Net-juu%20no%20Susume%20Episode%207'
r = requests.get(url, stream=True)
remotesize = int(r.headers['content-length'])  # total size in bytes

print("Downloading {}.mp4!\n\n".format(url.split('title=')[1]))
pbar = ProgressBar(max_value=remotesize, widgets=widgets).start()
i = 0  # bytes written so far
with open('./tempy/tempy_file.mp4', 'wb') as f:
    for chunk in r.iter_content(chunk_size=5*1024*1024):
        if chunk:  # skip empty keep-alive chunks
            i += len(chunk)
            f.write(chunk)
            # max_value is in bytes, so report the byte count, not a percentage
            pbar.update(i)
pbar.finish()
print("Successfully downloaded!\n\n")
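
One detail worth checking in a loop like this: progressbar2 measures progress on the scale of max_value, so when max_value is the size in bytes, the value handed to update() should be the running byte count. If a percentage (0 to 100) is passed instead, FileTransferSpeed would likely report a fraction of a byte per second even while the download itself runs normally. The sketch below shows the byte-counting pattern using only the standard library. The names copy_with_progress and on_progress are illustrative, not part of requests or progressbar2.

```python
import io


def copy_with_progress(source, dest, total_bytes, on_progress, chunk_size=64 * 1024):
    """Copy source to dest in chunks, reporting cumulative bytes after each write."""
    done = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break  # end of stream
        dest.write(chunk)
        done += len(chunk)
        # Report bytes, on the same scale as total_bytes (not a percentage).
        on_progress(done, total_bytes)
    return done


# Demo with in-memory streams standing in for the HTTP response and the file.
payload = b"x" * 200000
source = io.BytesIO(payload)
dest = io.BytesIO()
reports = []
copied = copy_with_progress(
    source, dest, len(payload), lambda done, total: reports.append((done, total))
)
```

With requests, r.raw (when the request is made with stream=True) could play the role of source, and a callback such as lambda done, total: pbar.update(done) would feed the bar. That wiring is an assumption and is untested here.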

Screenshot:

The speed is just ridiculous.

Expected Solution:

Not sure if this GitHub issue was fixed.

  1. A solution within the requests module would be preferable, but I am open to any answer within the scope of Python that gets me a good speed.
  2. I want the download to be chunk-wise because I want to see the progress via the progress bar, so shutil.copyfileobj(r.raw) isn't what I'm looking for.
  3. I did try using multiple threads, but it only complicated things and didn't help. I think the problem is with writing the chunk to the buffer itself, and splitting this task between threads doesn't help.

Edit:

As suggested, I tried sending random user agents, as shown:

from random import choice

desktop_agents = ['Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.99 Safari/537.36',
                 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.99 Safari/537.36',
                 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.99 Safari/537.36',
                 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_1) AppleWebKit/602.2.14 (KHTML, like Gecko) Version/10.0.1 Safari/602.2.14',
                 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.71 Safari/537.36',
                 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.98 Safari/537.36',
                 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.98 Safari/537.36',
                 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.71 Safari/537.36',
                 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.99 Safari/537.36',
                 'Mozilla/5.0 (Windows NT 10.0; WOW64; rv:50.0) Gecko/20100101 Firefox/50.0']

def random_headers():
    return {'User-Agent': choice(desktop_agents),'Accept':'text/html,video/mp4,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8'}

and sent the request with these headers as r = requests.get(url, stream=True, headers=random_headers())

However, it made no difference. :(

Edit no. 2:

Tried it with a sample video from "http://www.sample-videos.com/video/mp4/720/big_buck_bunny_720p_5mb.mp4". Same problem persists. :/

Harshith Thota asked Apr 06 '26 13:04


1 Answer

As the others suggested, Google was throttling the speed. To work around this, I used Selenium WebDriver to download the links:

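from selenium import webdriver

# dir_name (download folder) and li (link to fetch) are not defined in this
# excerpt; they presumably come from the complete script.
chrome_options = webdriver.ChromeOptions()
prefs = {'download.default_directory': dir_name}
chrome_options.add_experimental_option('prefs', prefs)
# Recent Selenium releases take options=; older ones used chrome_options=.
driver = webdriver.Chrome(options=chrome_options)
driver.get(li)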

Well, at least I'm able to fully automate the download at the speed of Google Chrome's downloader.

If anyone can help me figure this one out, please reply in the comments and I'll upvote helpful replies:

  1. Figure out a way in Python to use multiple connections for each file, the way Free Download Manager does.
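
A common way to use several connections for one file is to split it into byte ranges and request each with an HTTP Range header. The sketch below uses only the standard library and assumes the server honors Range requests (answers 206 Partial Content) and that the total size is already known, for example from Content-Length. Whether the host in the question allows this, or still throttles each connection, is unknown. The names split_ranges, fetch_range and download_in_parts are hypothetical helpers, not part of any library.

```python
import concurrent.futures
import urllib.request


def split_ranges(total_size, parts):
    """Return inclusive (start, end) byte ranges that together cover total_size bytes."""
    if total_size <= 0 or parts <= 0:
        return []
    parts = min(parts, total_size)  # never create empty ranges
    base, extra = divmod(total_size, parts)
    ranges = []
    start = 0
    for index in range(parts):
        # The first `extra` ranges take one additional byte each.
        length = base + (1 if index < extra else 0)
        ranges.append((start, start + length - 1))
        start += length
    return ranges


def fetch_range(url, start, end):
    """Download bytes start..end (inclusive) of url over its own connection."""
    request = urllib.request.Request(
        url, headers={"Range": "bytes={}-{}".format(start, end)}
    )
    with urllib.request.urlopen(request) as response:
        if response.status != 206:
            raise RuntimeError("server did not honor the Range header")
        return response.read()


def download_in_parts(url, total_size, out_path, connections=4):
    """Fetch all ranges concurrently, then write them to out_path in order."""
    ranges = split_ranges(total_size, connections)
    with concurrent.futures.ThreadPoolExecutor(max_workers=connections) as pool:
        futures = [pool.submit(fetch_range, url, s, e) for s, e in ranges]
        with open(out_path, "wb") as out:
            for future in futures:  # submission order equals byte order
                out.write(future.result())
```

This version holds each part in memory until it is written, which keeps the sketch short. For large files, writing each part to its own offset or temporary file would use less memory.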

Here's the link to the complete script.

Harshith Thota answered Apr 09 '26 03:04


