Python script terminated by SIGKILL rather than throwing MemoryError

Update Again

I have tried to create some simple way to reproduce this, but have not been successful.

So far, I have tried various simple array allocations and manipulations, but they all throw a MemoryError rather than crashing with SIGKILL.

For example:

x = np.asarray(range(999999999))

or:

x = np.empty([100,100,100,100,7])

just throw MemoryErrors as they should.

I hope to have a simple way to recreate this at some point.
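
One possible direction for a reproduction, offered only as a sketch: a single huge request is usually refused up front, which surfaces as MemoryError, whereas the strace output in this question shows many separate mappings of about 64 MB that each succeed. The function below imitates that pattern by allocating moderate chunks and writing to every page so the kernel has to back them with real memory. The name allocate_in_chunks and the PAGE_SIZE constant are hypothetical, and whether a large run ends in "Killed" or in MemoryError depends on the machine's RAM, swap, and overcommit settings.

```python
PAGE_SIZE = 4096  # assumed page size; typical on x86-64 Linux


def allocate_in_chunks(chunk_bytes, max_chunks):
    """Allocate up to max_chunks blocks of chunk_bytes and touch every page.

    Hypothetical helper: keeping all blocks alive makes resident memory
    grow steadily instead of being requested in one large call.
    """
    chunks = []
    for _ in range(max_chunks):
        block = bytearray(chunk_bytes)
        # Write one byte per page so each page is actually resident.
        for offset in range(0, chunk_bytes, PAGE_SIZE):
            block[offset] = 1
        chunks.append(block)
    return chunks
```

Calling it with a chunk size of 64 MB and a chunk count well beyond physical memory may get the process killed, so it is best tried only on a disposable instance.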

End Update

I have a python script running numpy/scipy and some custom C extensions.

On my Ubuntu 14.04 under VirtualBox, it runs to completion just fine.

On an Amazon EC2 T2 micro instance, it terminates (after running a while) with the output:

Killed

Running under the python debugger, the signal is not caught and the debugger exits as well.

Running under strace, I get:

munmap(0x7fa5b7fa6000, 67112960)        = 0
mmap(NULL, 67112960, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fa5b7fa6000    
mmap(NULL, 67112960, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fa5affa4000    
mmap(NULL, 67112960, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fa5abfa3000    
mmap(NULL, 67637248, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fa5a7f22000    
mmap(NULL, 67637248, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fa5a3ea1000    
mmap(NULL, 67637248, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fa59fe20000    
gettimeofday({1406518336, 306209}, NULL) = 0    
gettimeofday({1406518336, 580022}, NULL) = 0    
+++ killed by SIGKILL +++

Running under gdb while trying to catch "SIGKILL", I get:

[Thread 0x7fffe7148700 (LWP 28022) exited]

Program terminated with signal SIGKILL, Killed.
The program no longer exists.
(gdb) where
No stack.

Running Python's trace module (python -m trace --trace), I get:

defmatrix.py(292):         if (isinstance(obj, matrix) and obj._getitem): return
defmatrix.py(293):         ndim = self.ndim
defmatrix.py(294):         if (ndim == 2):
defmatrix.py(295):             return
defmatrix.py(336):         return out
 --- modulename: linalg, funcname: norm
linalg.py(2052):     x = asarray(x)
 --- modulename: numeric, funcname: asarray
numeric.py(460):     return array(a, dtype, copy=False, order=order)

I can't think of anything else at the moment to figure out what is going on.

I suspect it is running out of memory (it is an AWS micro instance), but I can't figure out how to confirm or rule that out.

Is there another tool I could use that might help pinpoint exactly where the program is stopping? Or am I running one of the above tools the wrong way for this problem?
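
One way to confirm or rule out memory exhaustion from inside the script is to log memory figures at a few checkpoints, so the last line printed before "Killed" shows how close the process was to the limit. The sketch below uses the standard resource module and /proc/meminfo; the helper names (peak_rss_kib, parse_meminfo, log_memory) are hypothetical, and it assumes a Linux system, where ru_maxrss is reported in kilobytes.

```python
import resource


def peak_rss_kib():
    """Peak resident set size of this process (kilobytes on Linux)."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a {field: integer value} dict."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields:
            info[name.strip()] = int(fields[0])
    return info


def log_memory(label, meminfo_path="/proc/meminfo"):
    """Print peak RSS plus system-wide free memory and free swap."""
    with open(meminfo_path) as fh:
        info = parse_meminfo(fh.read())
    print("%s: peak RSS %d kB, MemFree %s kB, SwapFree %s kB"
          % (label, peak_rss_kib(), info.get("MemFree"), info.get("SwapFree")))
```

Calling log_memory("before norm") just ahead of a suspect step, for instance, would leave a trail showing whether free memory was approaching zero when the process died.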

Update

The Amazon EC2 T2 micro instance has no swap space defined by default, so I added a 4GB swap file and was able to run the program to completion.

However, I am still very interested in a way to run the program so that it terminates with a message closer to "Not Enough Memory" rather than "Killed".

If anyone has any suggestions, they would be appreciated.
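
One per-process option, sketched here under assumptions: capping the process's address space with resource.setrlimit(resource.RLIMIT_AS, ...) makes requests beyond the cap fail inside the process, which Python reports as MemoryError, instead of leaving the decision to the kernel. The names address_space_cap and try_allocate are hypothetical, and a suitable cap value depends on the instance. A C extension that does not check the result of its own allocations could still crash in a different way.

```python
import contextlib
import resource


@contextlib.contextmanager
def address_space_cap(max_bytes):
    """Temporarily lower the soft RLIMIT_AS limit of this process."""
    soft, hard = resource.getrlimit(resource.RLIMIT_AS)
    # The soft limit may not exceed the hard limit.
    if hard != resource.RLIM_INFINITY:
        max_bytes = min(max_bytes, hard)
    resource.setrlimit(resource.RLIMIT_AS, (max_bytes, hard))
    try:
        yield
    finally:
        # Restore the original soft limit on the way out.
        resource.setrlimit(resource.RLIMIT_AS, (soft, hard))


def try_allocate(n_bytes, cap_bytes):
    """Return True if n_bytes could be allocated under cap_bytes, else False."""
    with address_space_cap(cap_bytes):
        try:
            block = bytearray(n_bytes)
        except MemoryError:
            return False
        del block
        return True
```

In a real script the cap would wrap the main computation, with an except MemoryError handler that prints a clear "not enough memory" message before exiting.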

asked Jul 28 '14 17:07 by rkh


1 Answer

It sounds like you've run into the dreaded Linux OOM killer. When the system runs completely out of memory and the kernel absolutely needs to allocate memory, it kills a process rather than crashing the entire system.

Look in the syslog for confirmation of this. A line similar to:

kernel: [884145.344240] mysqld invoked oom-killer:

followed sometime later by:

kernel: [884145.344399] Out of memory: Kill process 3318

should be present. (In this example, the process mentioned is mysqld.)
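
As a rough sketch of that check, the function below filters log lines for the two messages quoted above. The name find_oom_lines and the marker strings are assumptions based on those sample lines; the exact wording can differ between kernel versions.

```python
# Substrings taken from the sample kernel messages; treated as assumptions.
OOM_MARKERS = ("invoked oom-killer", "Out of memory: Kill")


def find_oom_lines(lines):
    """Return the log lines that look like OOM-killer messages."""
    return [line.rstrip("\n") for line in lines
            if any(marker in line for marker in OOM_MARKERS)]
```

On Ubuntu it could be fed an open handle on /var/log/syslog, for example with open("/var/log/syslog") as fh: print(find_oom_lines(fh)), assuming the file is readable by the current user.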

You can add these lines to your /etc/sysctl.conf file to effectively disable the OOM killer:

vm.overcommit_memory = 2
vm.overcommit_ratio = 100

Then reboot (or run sysctl -p to apply the settings immediately). Now the original, memory-hungry process should fail to allocate memory and, hopefully, throw the proper exception.

Setting overcommit_memory to 2 means that Linux won't overcommit memory, so memory allocations will fail if there isn't enough memory for them. See this answer for details on the effect of overcommit_ratio: https://serverfault.com/a/510857
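
To illustrate what overcommit_ratio does in this mode: the kernel documentation describes the commit limit as swap plus overcommit_ratio percent of RAM (excluding huge pages). The sketch below only does that arithmetic; the function name commit_limit_kib is hypothetical, and real values should be read from CommitLimit in /proc/meminfo.

```python
def commit_limit_kib(ram_kib, swap_kib, overcommit_ratio, hugetlb_kib=0):
    """Approximate commit limit when vm.overcommit_memory = 2.

    limit = swap + (RAM - huge pages) * overcommit_ratio / 100
    """
    return swap_kib + (ram_kib - hugetlb_kib) * overcommit_ratio // 100
```

For example, assuming a machine with 1 GiB of RAM, commit_limit_kib(1048576, 0, 100) gives 1048576 kB with no swap, and adding a 4 GiB swap file raises it to 5242880 kB.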

answered Oct 05 '22 03:10 by Ross Ridge