import threading
import Queue
import urllib2
import time
class ThreadURL(threading.Thread):
def __init__(self, queue):
threading.Thread.__init__(self)
self.queue = queue
def run(self):
while True:
host = self.queue.get()
sock = urllib2.urlopen(host)
data = sock.read()
self.queue.task_done()
hosts = ['http://www.google.com', 'http://www.yahoo.com', 'http://www.facebook.com', 'http://stackoverflow.com']
start = time.time()
def main():
queue = Queue.Queue()
for i in range(len(hosts)):
t = ThreadURL(queue)
t.start()
for host in hosts:
queue.put(host)
queue.join()
if __name__ == '__main__':
main()
print 'Elapsed time: {0}'.format(time.time() - start)
I've been trying to get my head around how to perform Threading and after a few tutorials, I've come up with the above.
What it's supposed to do is:
What I want to know first off is, am I doing this correctly? Is this the best way to handle threads?
Secondly, my program fails to exit. It prints out the Elapsed time
line and then hangs there. I have to kill my terminal for it to go away. I'm assuming this is due to my incorrect use of queue.join()
?
In fact, a Python process cannot run threads in parallel but it can run them concurrently through context switching during I/O bound operations. This limitation is actually enforced by GIL. The Python Global Interpreter Lock (GIL) prevents threads within the same process to be executed at the same time.
You can make a queue or line of tasks or objects by using the queue library in Python. Simply you can add a task to the queue (using put() method) or get a task out of the line and processes it (using get() method).
The Python standard library provides threading , which contains most of the primitives you'll see in this article. Thread , in this module, nicely encapsulates threads, providing a clean interface to work with them. When you create a Thread , you pass it a function and a list containing the arguments to that function.
You need to assign the thread object to a variable and then start it using that varaible: thread1=threading. Thread(target=f) followed by thread1. start() . Then you can do thread1.
Your code looks fine and is quite clean.
The reason your application still "hangs" is that the worker threads are still running, waiting for the main application to put something in the queue, even though your main thread is finished.
The simplest way to fix this is to mark the threads as daemons, by doing t.daemon = True
before your call to start. This way, the threads will not block the program stopping.
looks fine. yann is right about the daemon suggestion. that will fix your hang. my only question is why use the queue at all? you're not doing any cross thread communication, so it seems like you could just send the host info as an arg to ThreadURL init() and drop the queue.
nothing wrong with it, just wondering.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With