I just played around a bit with Python and threads, and realized that even in a multithreaded script, DNS requests are blocking. Consider the following script:
from threading import Thread
import socket

class Connection(Thread):
    def __init__(self, name, url):
        Thread.__init__(self)
        self._url = url
        self._name = name

    def run(self):
        print("Connecting...", self._name)
        try:
            s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            s.setblocking(0)
            s.connect((self._url, 80))
        except socket.gaierror:
            pass  # not interested in it
        print("finished", self._name)

if __name__ == '__main__':
    conns = []
    # all invalid addresses to see how they fail / check times
    conns.append(Connection("conn1", "www.2eg11erdhrtj.com"))
    conns.append(Connection("conn2", "www.e2ger2dh2rtj.com"))
    conns.append(Connection("conn3", "www.eg2de3rh1rtj.com"))
    conns.append(Connection("conn4", "www.ege2rh4rd1tj.com"))
    conns.append(Connection("conn5", "www.ege52drhrtj1.com"))
    for conn in conns:
        conn.start()
I don't know exactly how long the timeout is, but when running this the following happens:
So my only guess is that this has to do with the GIL? The threads clearly do not perform their task concurrently; only one connection is attempted at a time.
Does anyone know a way around this?
(asyncore doesn't help, and I'd prefer not to use Twisted for now.) Isn't it possible to get this simple little thing done with Python?
Greetings, Tom
I am on Mac OS X. I just let my friend run this on Linux, and he actually gets the results I wished to get: his socket.connect() calls return immediately, even in a non-threaded environment. And even when he sets the sockets to blocking, with a timeout of 10 seconds, all his threads finish at the same time.
Can anyone explain this?
Python does support multithreading: the threading library works, and the GIL does not prevent you from creating threads. What CPython's GIL does prevent is two threads in the same process executing Python bytecode at the same time, so threads cannot run in parallel on multiple cores; they run concurrently, with the interpreter switching between them, typically during I/O-bound operations.
Both multithreading and multiprocessing allow Python code to run concurrently, but only multiprocessing gives true parallelism. If your code is I/O-heavy (like HTTP requests or DNS lookups), multithreading will still usually speed it up. Libraries that perform computationally heavy tasks, like numpy, scipy and pytorch, use C-based implementations under the hood that can release the GIL and so make use of multiple cores.
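To illustrate the I/O-bound case, here is a minimal sketch (using time.sleep() as a stand-in for a blocking network call) showing that five threads waiting on I/O overlap rather than run back to back:

```python
import threading
import time

def io_task(results, i):
    # Simulate a blocking I/O call (e.g. a network read) with sleep();
    # sleep() releases the GIL, so other threads can run in the meantime.
    time.sleep(0.2)
    results[i] = "done"

results = {}
start = time.monotonic()
threads = [threading.Thread(target=io_task, args=(results, i)) for i in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.monotonic() - start

# Five 0.2 s "I/O waits" overlap, so total wall time is ~0.2 s, not ~1.0 s.
print(len(results), elapsed)
```

If the waits were serialized (as the blocking DNS lookups above are), the total time would instead be roughly the sum of the individual waits.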
On some systems, getaddrinfo is not thread-safe. Python believes that some such systems are FreeBSD, OpenBSD, NetBSD, OSX, and VMS. On those systems, Python maintains a lock specifically for the netdb (i.e. getaddrinfo and friends).
So if you can't switch operating systems, you'll have to use a different (thread-safe) resolver library, such as Twisted's.
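Conceptually, that netdb lock turns concurrent lookups into serial ones. A minimal sketch of the effect, with a hypothetical slow_resolve() standing in for a non-thread-safe getaddrinfo():

```python
import threading
import time

netdb_lock = threading.Lock()  # models CPython's per-process netdb lock

def slow_resolve(name):
    # Hypothetical stand-in for getaddrinfo(): because the C function is
    # not thread-safe on some platforms, the interpreter wraps every call
    # in one lock, so the calls are serialized.
    with netdb_lock:
        time.sleep(0.1)  # pretend the lookup takes 100 ms
        return name

start = time.monotonic()
threads = [threading.Thread(target=slow_resolve, args=("host%d" % i,))
           for i in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.monotonic() - start

# With the lock held across each call, five lookups take ~5 * 0.1 s
# even though they run in five threads.
print(elapsed)
```

This is exactly the serial behaviour observed in the question: the threads start concurrently, but the resolver only admits one of them at a time.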
Send DNS requests asynchronously using Twisted Names:
import sys

from twisted.internet import reactor
from twisted.internet import defer
from twisted.names import client
from twisted.python import log

def process_names(names):
    log.startLogging(sys.stderr, setStdout=False)

    def print_results(results):
        for name, (success, result) in zip(names, results):
            if success:
                print("%s -> %s" % (name, result))
            else:
                print("error: %s failed. Reason: %s" % (name, result),
                      file=sys.stderr)

    d = defer.DeferredList(
        [client.getHostByName(name) for name in names], consumeErrors=True)
    d.addCallback(print_results)
    d.addErrback(defer.logError)
    d.addBoth(lambda _: reactor.stop())

reactor.callWhenRunning(process_names, """
google.com
www.2eg11erdhrtj.com
www.e2ger2dh2rtj.com
www.eg2de3rh1rtj.com
www.ege2rh4rd1tj.com
www.ege52drhrtj1.com
""".split())

reactor.run()
If it's suitable, you could use the multiprocessing module to enable process-based parallelism:
import multiprocessing
import socket

NUM_PROCESSES = 5

def get_url(url):
    try:
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        s.setblocking(0)
        s.connect((url, 80))
    except socket.gaierror:
        pass  # not interested in it
    return 'finished ' + url

def main(url_list):
    pool = multiprocessing.Pool(NUM_PROCESSES)
    for output in pool.imap_unordered(get_url, url_list):
        print(output)

if __name__ == "__main__":
    main("""
www.2eg11erdhrtj.com
www.e2ger2dh2rtj.com
www.eg2de3rh1rtj.com
www.ege2rh4rd1tj.com
www.ege52drhrtj1.com
""".split())