ThreadPoolExecutor + Requests == deadlock?

I have a small script that makes a lot of requests to the Google search service:

import time
from concurrent.futures import ThreadPoolExecutor

import requests
import requests.packages.urllib3

# Silence urllib3 warnings (e.g. about insecure HTTPS requests).
requests.packages.urllib3.disable_warnings()


def check(page):
    # Note: everything after '#' is a URL fragment and is not sent to the server.
    r = requests.get('https://www.google.ru/#q=test&start={}'.format(page * 10))
    return len(r.text)


def main():
    for q in xrange(30):
        st_t = time.time()
        # 20 worker threads fetch 999 pages per round.
        with ThreadPoolExecutor(20) as pool:
            ret = list(pool.map(check, xrange(1, 1000)))
            print time.time() - st_t


if __name__ == "__main__":
    main()

It works at first, but then something goes wrong: all 20 threads are still alive, yet they do nothing. I can see in htop that they are alive, but I don't understand why nothing happens.

Any ideas what could be wrong?

Anton Hulikau asked Nov 30 '16 14:11





1 Answer

This is a known issue, but the requests team did not get enough information to debug it, see this. It is possibly a CPython issue, see this.
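If you want to gather that missing debugging information yourself, one option is to dump the stack of every thread once the program stalls, so you can see which call each worker is blocked in. Below is a minimal sketch using only the standard library. The function name `format_thread_stacks` is my own and is not part of requests or concurrent.futures, and it relies on `sys._current_frames()`, which is a CPython implementation detail. It should work on both Python 2.7 and Python 3, but treat it as a starting point rather than a definitive tool.

```python
import sys
import threading
import traceback


def format_thread_stacks():
    """Return a text report with the current call stack of every live thread.

    Hypothetical helper for diagnosing a hang; uses the CPython-specific
    sys._current_frames() to get the topmost frame of each thread.
    """
    # Map thread identifiers to readable names where the threading module knows them.
    names = {t.ident: t.name for t in threading.enumerate()}
    report = []
    for ident, frame in sys._current_frames().items():
        report.append("--- thread {} ({}) ---\n".format(names.get(ident, "?"), ident))
        # format_stack() returns ready-to-print lines, outermost call first.
        report.extend(traceback.format_stack(frame))
    return "".join(report)
```

You could call it from a timer started before the pool is created, for example `threading.Timer(60, lambda: sys.stderr.write(format_thread_stacks())).start()`, where the 60-second delay is an arbitrary guess at when the hang has set in (cancel the timer if the run finishes normally). On Python 3, `faulthandler.dump_traceback_later()` gives similar output without custom code.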

krizex answered Sep 20 '22 17:09
