This is the strangest thing!
I have a multi-threaded client application written in Python. I'm using threading to concurrently download and process pages. I would use the cURL multi-handle, except that the bottleneck in this application is definitely the processor (not the bandwidth), so it is more efficient to use a thread pool.
I have a 64-bit i7 rocking 16 GB of RAM. Beefy. I launch 80 threads while listening to Pandora and trolling Stack Overflow and BAM! The parent process sometimes ends with the message
Killed
Other times a single page (which is its own process in Chrome) will die. Other times the whole browser crashes.
If you want to see a bit of code, here is the gist of it:
Here is the parent process:
def start( ):
    while True:
        for url in to_download:
            queue.put( ( url, uri_id ) )
        to_download = [ ]

        if queue.qsize( ) < BATCH_SIZE:
            to_download = get_more_urls( BATCH_SIZE )

        if threading.activeCount( ) < NUM_THREADS:
            for thread in threads:
                if not thread.isAlive( ):
                    print "Respawning..."
                    thread.join( )
                    threads.remove( thread )
                    t = ClientThread( queue )
                    t.start( )
                    threads.append( t )

        time.sleep( 0.5 )
And here is the gist of the ClientThread:
class ClientThread( threading.Thread ):
    def __init__( self, queue ):
        threading.Thread.__init__( self )
        self.queue = queue

    def run( self ):
        while True:
            try:
                self.url, self.url_id = self.queue.get( )
            except:
                raise SystemExit

            html = StringIO.StringIO( )
            curl = pycurl.Curl( )
            curl.setopt( pycurl.URL, self.url )
            curl.setopt( pycurl.NOSIGNAL, True )
            curl.setopt( pycurl.WRITEFUNCTION, html.write )
            try:
                curl.perform( )
            except pycurl.error, error:
                errno, errstr = error
                print errstr
            curl.close( )
EDIT: Oh, right...forgot to ask the question...should be obvious: Why do my processes get killed? Is it happening at the OS level? Kernel level? Is this due to a limitation on the number of open TCP connections I can have? Is it a limit on the number of threads I can run at once? The output of cat /proc/sys/kernel/threads-max is 257841. So...I don't think it's that....
I think I've got it...OK...I have no swap space at all on my drive. Is there a way to create some swap space now? I'm running Fedora 16. There WAS swap...then I enabled all my RAM and it disappeared magically. Tailing /var/log/messages I found this error:
Mar 26 19:54:03 gazelle kernel: [700140.851877] [15961] 500 15961 12455 7292 1 0 0 postgres
Mar 26 19:54:03 gazelle kernel: [700140.851880] Out of memory: Kill process 15258 (chrome) score 5 or sacrifice child
Mar 26 19:54:03 gazelle kernel: [700140.851883] Killed process 15258 (chrome) total-vm:214744kB, anon-rss:70660kB, file-rss:18956kB
Mar 26 19:54:05 gazelle dbus: [system] Activating service name='org.fedoraproject.Setroubleshootd' (using servicehelper)
CPython does not support true multi-core execution via multithreading, but Python does have a threading library, and the GIL does not prevent you from creating threads. Only a single thread can hold the GIL at a time, which means the interpreter ultimately runs Python bytecode serially. This design makes memory management thread-safe, but as a consequence a single process cannot spread Python execution across multiple CPU cores.
Both multithreading and multiprocessing allow Python code to run concurrently, but only multiprocessing makes it truly parallel. However, if your code is I/O-heavy (like HTTP requests), multithreading will still probably speed it up.
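Since your bottleneck is the processor, a minimal sketch of that split might look like this (Python 2 to match the code above; parse_page(), URLS and the pool size are hypothetical stand-ins, not taken from your program):

# Sketch: fetch pages with pycurl, but hand the CPU-heavy processing to a
# pool of worker processes so it can actually use more than one core.
import multiprocessing
import StringIO
import pycurl

def fetch( url ):
    html = StringIO.StringIO( )
    curl = pycurl.Curl( )
    curl.setopt( pycurl.URL, url )
    curl.setopt( pycurl.NOSIGNAL, True )
    curl.setopt( pycurl.WRITEFUNCTION, html.write )
    curl.perform( )
    curl.close( )
    return html.getvalue( )

def parse_page( body ):
    return len( body )                          # stand-in for the expensive processing

if __name__ == '__main__':
    URLS = [ 'http://example.com/' ]            # hypothetical URL source
    pool = multiprocessing.Pool( processes=4 )  # roughly one worker per core
    bodies = [ fetch( url ) for url in URLS ]   # I/O-bound: threads would also be fine here
    results = pool.map( parse_page, bodies )    # CPU-bound: real parallelism across cores
    pool.close( )
    pool.join( )
    print results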
You've triggered the kernel's Out Of Memory (OOM) handler; it selects which processes to kill in a complicated fashion that tries hard to kill as few processes as possible to make the most impact. Chrome apparently makes the most inviting process to kill under the criteria the kernel uses.
You can see a summary of the criteria in the proc(5) manpage under the /proc/[pid]/oom_score file:
/proc/[pid]/oom_score (since Linux 2.6.11)
This file displays the current score that the kernel
gives to this process for the purpose of selecting a
process for the OOM-killer. A higher score means that
the process is more likely to be selected by the OOM-
killer. The basis for this score is the amount of
memory used by the process, with increases (+) or
decreases (-) for factors including:
* whether the process creates a lot of children using
fork(2) (+);
* whether the process has been running a long time, or
has used a lot of CPU time (-);
* whether the process has a low nice value (i.e., > 0)
(+);
* whether the process is privileged (-); and
* whether the process is making direct hardware access
(-).
The oom_score also reflects the bit-shift adjustment
specified by the oom_adj setting for the process.
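To see those scores in practice you can just read each process's oom_score file; here is a rough sketch that walks /proc and prints the ten most tempting victims (nothing in it is specific to your setup):

# Sketch: list the processes the OOM killer currently finds most attractive.
import os

scores = [ ]
for pid in filter( str.isdigit, os.listdir( '/proc' ) ):
    try:
        scores.append( ( int( open( '/proc/%s/oom_score' % pid ).read( ) ), int( pid ) ) )
    except IOError:
        pass  # the process exited while we were looking

for score, pid in sorted( scores, reverse=True )[ :10 ]:
    print score, pid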
You can adjust the oom_adj file (oom_score_adj on newer kernels) for your Python program if you want it to be the one that is killed; oom_score itself is read-only.
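For example, a process can volunteer itself from inside Python; a minimal sketch, assuming a kernel new enough to expose oom_score_adj (Fedora 16's 3.x kernel is):

# Raise our own OOM badness so the kernel picks us instead of Chrome.
# Raising the value needs no privileges; lowering it requires CAP_SYS_RESOURCE.
with open( '/proc/self/oom_score_adj', 'w' ) as f:
    f.write( '1000' )   # 1000 = most attractive victim; negative values protect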
Probably the better approach is adding more swap to your system to try to push off the time when the OOM-killer is invoked. Granted, having more swap doesn't necessarily mean that your system will never run out of memory -- and you might not care for the way it handles if there is a lot of swap traffic -- but it can at least get you past tight memory problems.
If you've already allocated all the space available for swap partitions, you can add swap files. Because they go through the filesystem, there is more overhead for swap files than swap partitions, but you can add them after the drive is partitioned, making it an easy short-term solution. You use the dd(1) command to allocate the file (do not use seek to make a sparse file), then use mkswap(8) to format the file for swap use, then use swapon(8) to turn on that specific file. (I think you can even add swap files to fstab(5) to make them automatically available at the next reboot, too, but I've never tried and don't know the syntax.)
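For what it's worth, here is a sketch of those steps driven from Python with subprocess; the file name and size are only examples, it must run as root, and it does nothing beyond the dd, mkswap and swapon invocations described above:

# Sketch: create and enable a 4 GiB swap file.
import subprocess

SWAPFILE = '/swapfile'                                 # hypothetical location

subprocess.check_call( [ 'dd', 'if=/dev/zero', 'of=' + SWAPFILE,
                         'bs=1M', 'count=4096' ] )     # allocate for real, not sparse
subprocess.check_call( [ 'chmod', '600', SWAPFILE ] )  # keep other users out
subprocess.check_call( [ 'mkswap', SWAPFILE ] )        # format it for swap use
subprocess.check_call( [ 'swapon', SWAPFILE ] )        # turn on that specific file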