
Getting error "Exception in thread Thread-13 (most likely raised during interpreter shutdown)"

I wrote a simple script that uses threads to retrieve data from a service.

__author__ = 'Igor'
import requests
import time
from multiprocessing.dummy import Pool as ThreadPool

ip_list = []
good_ip_list = []
bad_ip_list = []
progress = 0

with open('/tmp/ip.txt') as f:
    ip_list = f.read().split()

def process_request(ip):
    global progress
    progress += 1
    if progress % 10000 == 0:
        print 'Processed ip:', progress, '...'
    r = requests.get('http://*****/?ip='+ip, timeout=None)
    if r.status_code == 200:
        good_ip_list.append(ip)
    elif r.status_code == 400:
        bad_ip_list.append(ip)
    else:
        print 'Unknown http code received, aborting'
        exit(1)

pool = ThreadPool(16)
try:
    pool.map(process_request, ip_list)
except:
    for name, ip_list in (('/tmp/out_good.txt', good_ip_list), ('/tmp/out_bad.txt', bad_ip_list)):
        with open(name, 'w') as f:
            for ip in ip_list:
                print>>f, ip

But after some requests are processed (40k-50k), I receive:

Exception in thread Thread-7 (most likely raised during interpreter shutdown):
Traceback (most recent call last):

Process finished with exit code 0

I tried to change the service settings:

        <timeout>999</timeout>
        <connectionlimit>600</connectionlimit>
        <httpthreads>32</httpthreads>
        <workerthreads>128</workerthreads>

but I still get the same error. Can anybody help me figure out what's wrong?

Igor asked Mar 13 '15 at 08:03


2 Answers

Thanks to everybody who helped me solve this problem. I rewrote the whole code, and now it works perfectly:

__author__ = 'kulakov'
import requests
import time
from multiprocessing.dummy import Pool as ThreadPool

ip_list = []
good_ip_list = []
bad_ip_list = []

with open('/tmp/ip.txt') as f:
    ip_list = f.read().split()

# one Session shared by all threads enables HTTP keep-alive
s = requests.Session()
def process_request(ip):
    r = s.get('http://*****/?ip='+ip, timeout=None)
    if r.status_code == 200:
        # good_ip_list.append(ip)
        return (ip, True)
    elif r.status_code == 400:
        # bad_ip_list.append(ip)
        return (ip, False)
    else:
        print 'Unknown http code received, aborting'
        exit(1)

pool = ThreadPool(16)
# collect results in the main thread instead of mutating shared lists
for ip, isOk in pool.imap(process_request, ip_list):
    if isOk:
        good_ip_list.append(ip)
    else:
        bad_ip_list.append(ip)
# make sure every worker thread has finished before the interpreter exits
pool.close()
pool.join()

for name, ip_list in (('/tmp/out_good.txt', good_ip_list), ('/tmp/out_bad.txt', bad_ip_list)):
    with open(name, 'w') as f:
        for ip in ip_list:
            print>>f, ip

Some new useful information:

1) It was a really bad idea to write data to shared lists from different threads inside process_request; now it returns a status flag (True/False) together with the ip.

2) Keep-alive is fully supported by requests by default, but to use it you must create a Session instance and call the get method on it:

s = requests.Session()
r = s.get('http://*****/?ip='+ip, timeout=None)
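One caveat (an assumption worth checking, not something from this answer): a Session's connection pool defaults to 10 connections per host, which is fewer than the 16 pool threads above, so some requests may still open fresh connections instead of reusing kept-alive ones. A minimal sketch of sizing the pool to match the thread count by mounting a larger HTTPAdapter:

import requests
from requests.adapters import HTTPAdapter

s = requests.Session()
# size the per-host connection pool to the number of worker threads (16 here)
s.mount('http://', HTTPAdapter(pool_connections=16, pool_maxsize=16))
# s.get(...) is then used exactly as in the snippet above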
Igor answered Nov 10 '22 at 08:11


This:

good_ip_list = []
bad_ip_list = []

is not safe to mix with Python multiprocessing. The correct approach is to return a tuple (or something) from each call to process_request and then concatenate them all at the end. It's also not safe to modify progress concurrently from multiple processes. I'm not positive what your error is, but I bet it's some synchronization problem that is killing Python as a whole.

Remove the shared state and try again.
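If a progress counter is still wanted, one option (a minimal sketch, not from this answer; count_progress and progress_lock are illustrative names, and it assumes the thread-based multiprocessing.dummy pool from the question) is to guard the counter with a threading.Lock so the read-modify-write cannot interleave:

import threading

progress = 0
progress_lock = threading.Lock()

def count_progress():
    # serialize the read-modify-write: without the lock, two threads can
    # both read the same value of progress and one increment is lost
    global progress
    with progress_lock:
        progress += 1
        return progress

Each worker would call count_progress() instead of doing progress += 1 itself, and decide whether to print based on the returned value.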

Patrick Collins answered Nov 10 '22 at 09:11