OS starts killing processes when multi-threaded python process runs

This is the strangest thing!

I have a multi-threaded client application written in Python. I'm using threading to concurrently download and process pages. I would use the cURL multi-handle except that the bottleneck is definitely the processor (not the bandwidth) in this application so it is more efficient to use a thread pool.

I have a 64-bit i7 rocking 16GB RAM. Beefy. I launch 80 threads while listening to Pandora and trolling Stack Overflow and BAM! The parent process sometimes ends with the message

Killed

Other times a single page (which is its own process in Chrome) will die. Other times the whole browser crashes.

If you want to see a bit of code, here is the gist of it:

Here is the parent process:

def start( ):
  to_download = [ ]    # (url, uri_id) pairs waiting to be queued

  while True:
    # Enqueue the current batch; assumes get_more_urls( ) returns
    # (url, uri_id) pairs.
    for url, uri_id in to_download:
      queue.put( ( url, uri_id ) )

    to_download = [ ]

    # Top up the batch when the queue is running low.
    if queue.qsize( ) < BATCH_SIZE:
      to_download = get_more_urls( BATCH_SIZE )

    # Replace any worker threads that have died.
    if threading.activeCount( ) < NUM_THREADS:
      # Iterate over a copy; removing from the list while iterating
      # it directly would skip elements.
      for thread in threads[ : ]:
        if not thread.isAlive( ):
          print "Respawning..."
          thread.join( )
          threads.remove( thread )
          t = ClientThread( queue )
          t.start( )
          threads.append( t )

    time.sleep( 0.5 )

And here is the gist of the ClientThread:

class ClientThread( threading.Thread ):

  def __init__( self, queue ):
    threading.Thread.__init__( self )
    self.queue = queue

  def run( self ):
    while True:
      # queue.get( ) blocks until an item is available, so this
      # except clause is only a safety net.
      try:
        self.url, self.url_id = self.queue.get( )
      except:
        raise SystemExit

      html = StringIO.StringIO( )
      curl = pycurl.Curl( )
      curl.setopt( pycurl.URL, self.url )
      curl.setopt( pycurl.NOSIGNAL, True )
      curl.setopt( pycurl.WRITEFUNCTION, html.write )

      try:
        curl.perform( )
      except pycurl.error, error:
        errno, errstr = error
        print errstr

      # Release the handle and the download buffer once the transfer
      # is done; closing the handle before perform( ) would break it.
      curl.close( )
      html.close( )

EDIT: Oh, right...forgot to ask the question...should be obvious: Why do my processes get killed? Is it happening at the OS level? Kernel level? Is this due to a limitation on the number of open TCP connections I can have? Is it a limit on the number of threads I can run at once? The output of cat /proc/sys/kernel/threads-max is 257841. So...I don't think it's that....

I think I've got it...OK...I have no swap space at all on my drive. Is there a way to create some swap space now? I'm running Fedora 16. There WAS swap...then I enabled all my RAM and it disappeared magically. Tailing /var/log/messages I found this error:

Mar 26 19:54:03 gazelle kernel: [700140.851877] [15961]   500 15961    12455     7292   1       0             0 postgres
Mar 26 19:54:03 gazelle kernel: [700140.851880] Out of memory: Kill process 15258 (chrome) score 5 or sacrifice child
Mar 26 19:54:03 gazelle kernel: [700140.851883] Killed process 15258 (chrome) total-vm:214744kB, anon-rss:70660kB, file-rss:18956kB
Mar 26 19:54:05 gazelle dbus: [system] Activating service name='org.fedoraproject.Setroubleshootd' (using servicehelper)
asked Mar 26 '12 by KeatsKelleher

People also ask

Is Python actually multithreaded?

Python has real OS threads and a threading library, but on the CPython interpreter the Global Interpreter Lock (GIL) lets only one thread execute Python bytecode at a time, so threading does not give true multi-core parallelism. The GIL does not prevent threading itself, only parallel execution of Python code.

Does threading in Python use multiple cores?

Only a single thread can acquire the GIL at a time, which means the interpreter ultimately runs Python bytecode serially. This design makes memory management thread-safe, but as a consequence threads can't spread pure-Python work across multiple CPU cores.

Is multiprocessing faster than multithreading in Python?

Both multithreading and multiprocessing allow Python code to run concurrently. Only multiprocessing will allow your code to be truly parallel. However, if your code is IO-heavy (like HTTP requests), then multithreading will still probably speed up your code.
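
To make that concrete, here is a minimal sketch (not part of the original post; the worker function and thread/process counts are made up) comparing threads and processes on a CPU-bound task:

import time
from threading import Thread
from multiprocessing import Process

def burn_cpu( ):
  # Pure-Python busy work; a thread holds the GIL for the duration.
  n = 0
  for i in xrange( 10 ** 7 ):
    n += i

def timed( worker_cls ):
  workers = [ worker_cls( target=burn_cpu ) for _ in range( 4 ) ]
  start = time.time( )
  for w in workers:
    w.start( )
  for w in workers:
    w.join( )
  return time.time( ) - start

if __name__ == '__main__':
  # Threads serialize on the GIL; processes can run on separate cores.
  print 'threads:   %.2fs' % timed( Thread )
  print 'processes: %.2fs' % timed( Process )

On a multi-core machine the process version should finish several times faster, while for I/O-bound work (like the downloads above) the gap largely disappears.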


1 Answer

You've triggered the kernel's Out Of Memory (OOM) handler; it selects which processes to kill in a complicated fashion that tries hard to kill as few processes as possible to make the most impact. Chrome apparently makes the most inviting process to kill under the criteria the kernel uses.

You can see a summary of the criteria in the proc(5) manpage under the /proc/[pid]/oom_score file:

   /proc/[pid]/oom_score (since Linux 2.6.11)
          This file displays the current score that the kernel
          gives to this process for the purpose of selecting a
          process for the OOM-killer.  A higher score means that
          the process is more likely to be selected by the OOM-
          killer.  The basis for this score is the amount of
          memory used by the process, with increases (+) or
          decreases (-) for factors including:

          * whether the process creates a lot of children using
            fork(2) (+);

          * whether the process has been running a long time, or
            has used a lot of CPU time (-);

          * whether the process has a low nice value (i.e., > 0)
            (+);

          * whether the process is privileged (-); and

          * whether the process is making direct hardware access
            (-).

          The oom_score also reflects the bit-shift adjustment
          specified by the oom_adj setting for the process.

You can adjust the oom_adj setting (oom_score_adj on newer kernels) for your Python program if you want it to be the one that is killed; the oom_score file itself is read-only.
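
For instance, a process can volunteer itself as the preferred victim; a minimal sketch, assuming a kernel new enough (2.6.36+) to expose oom_score_adj:

import os

# oom_score_adj ranges from -1000 (never kill) to 1000 (kill first);
# older kernels use /proc/<pid>/oom_adj with a -17..15 range instead.
with open( '/proc/%d/oom_score_adj' % os.getpid( ), 'w' ) as f:
  f.write( '1000' )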

Probably the better approach is adding more swap to your system to try to push off the time when the OOM-killer is invoked. Granted, having more swap doesn't necessarily mean that your system will never run out of memory -- and you might not care for the way it handles if there is a lot of swap traffic -- but it can at least get you past tight memory problems.

If you've already allocated all the space available for swap partitions, you can add swap files. Because they go through the filesystem, there is more overhead for swap files than swap partitions, but you can add them after the drive is partitioned, making it an easy short-term solution. You use the dd(1) command to allocate the file (do not use seek to make a sparse file) and then use mkswap(8) to format the file for swap use, then use swapon(8) to turn on that specific file. (I think you can even add swap files to fstab(5) to make them automatically available at next reboot, too, but I've never tried and don't know the syntax.)
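
A minimal sketch of those steps (the 4GB size and /swapfile path are only examples, and the fstab line is the conventional syntax rather than something from this answer; run as root):

dd if=/dev/zero of=/swapfile bs=1M count=4096   # allocate 4GB up front; no seek, so not sparse
chmod 600 /swapfile                             # mkswap complains if the file is world-readable
mkswap /swapfile                                # write the swap signature
swapon /swapfile                                # start swapping to it immediately

# conventional fstab(5) entry to enable it automatically at boot:
# /swapfile  none  swap  sw  0  0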

answered by sarnold