
Segmentation fault when creating multiprocessing array

I'm trying to fill a numpy array using multiprocessing, following this post. What I have works fine on my Mac, but when I port it to Ubuntu I get segmentation faults a lot of the time.

I've reduced the code to the following minimal example:

import numpy as np
from multiprocessing import sharedctypes

a = np.ctypeslib.as_ctypes(np.zeros((224,224,3)))
print("Made a, now making b")
b = sharedctypes.RawArray(a._type_, a)
print("Finished.")

On Ubuntu 16.04, with Python 3.6.5 and numpy 1.15.4 (same versions as on my Mac), I get the output

Made a, now making b
Segmentation fault (core dumped)

Now, I can change the array dimensions somewhat and in some cases it'll work (e.g., change the first 224 to 100 and it works). But mostly it segfaults.

Can anyone offer any insight?

I see one post on a related topic from 2016 that no one responded to, and another one involving pointers which I'm not using.

PS: It doesn't seem to make any difference whether I specify a as a multidimensional array or as a flattened array (e.g. np.zeros(224*224*3)). Changing the data type (e.g. float to int) doesn't help either; it fails the same way.

One further update: even setting "size=224" in the code from the original post causes segfaults on two different Ubuntu machines with different versions of numpy, but works fine on Mac.

sh37211 asked Dec 13 '18 08:12


2 Answers

This is more of a guess than an answer, but you may be running into an issue owing to garbage collection of the underlying data buffer. This may explain why there seems to be a dependence on the overall size of the array you're trying to create.

If that's the case, then the fix would be to assign the NumPy array of zeros that you create to its own variable. This would ensure that the buffer "lives" through the creation of the RawArray. The code would then be:

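import numpy as np
from multiprocessing import sharedctypes

# Bind the zeros array to a name so its buffer outlives the RawArray call.
zs = np.zeros((224,224,3))
a = np.ctypeslib.as_ctypes(zs)
print("Made a, now making b")
b = sharedctypes.RawArray(a._type_, a)
print("Finished.")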

I only have a mac right now, so I can't test this out myself.
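
The lifetime difference between the two call patterns can be checked without risking a crash. The sketch below uses only the standard library: Buffer and view_of are hypothetical stand-ins for the NumPy array and for an as_ctypes that keeps no reference to its argument, and the check relies on CPython's reference counting. The view's memory is never read, so nothing dangling is touched.

```python
import ctypes
import weakref


class Buffer:
    """Hypothetical stand-in for a NumPy array: it owns a block of memory."""

    def __init__(self, n):
        self.data = (ctypes.c_double * n)()


def view_of(buf):
    """Build a ctypes array over buf's memory; the result holds no reference to buf."""
    n = len(buf.data)
    return (ctypes.c_double * n).from_address(ctypes.addressof(buf.data))


def owner_alive_after_view(bind_to_name):
    """Return True if the owning Buffer still exists once the view has been made."""
    refs = []

    def make():
        buf = Buffer(8)
        refs.append(weakref.ref(buf))  # lets us observe the owner's lifetime
        return buf

    if bind_to_name:
        zs = make()          # owner bound to a name, as in the suggested fix
        a = view_of(zs)
    else:
        a = view_of(make())  # owner is a temporary, as in the question
    # `a` is deliberately never read: in the temporary case it points at freed memory.
    return refs[0]() is not None


print(owner_alive_after_view(bind_to_name=True))
print(owner_alive_after_view(bind_to_name=False))
```

If the guess above is right, the temporary case is the one where the owner is already gone by the time the view would be used.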

tel answered Oct 23 '22 21:10


Additional analysis and root-cause fix.

As pointed out above, this is the result of a garbage collection bug, and that gave me a hint as to how to fix it.

Keeping a reference to the original np.zeros object avoids the bug. To me, this meant that collecting the original object left the resulting ctypes array pointing at freed memory.

Looking at the implementation of as_ctypes (taken from c52543e4a)

def as_ctypes(obj):
    """Create and return a ctypes object from a numpy array.  Actually
    anything that exposes the __array_interface__ is accepted."""
    ai = obj.__array_interface__
    if ai["strides"]:
        raise TypeError("strided arrays not supported")
    if ai["version"] != 3:
        raise TypeError("only __array_interface__ version 3 supported")
    addr, readonly = ai["data"]
    if readonly:
        raise TypeError("readonly arrays unsupported")
    tp = _ctype_ndarray(_typecodes[ai["typestr"]], ai["shape"])
    result = tp.from_address(addr)
    result.__keep = ai
    return result

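it's evident that the original author thought of this (assigning .__keep to maintain a reference). However, ai is only the __array_interface__ dict, which holds the buffer's integer address and metadata, not the array itself, so keeping it alive does not keep the buffer alive. It seems they need to keep a reference to the original object instead.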

I've written a patch which does this:

-        result.__keep = ai
+        result.__keep = obj
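
The effect of that one-line change can be sketched with the standard library alone. Everything here is an assumption made for illustration: Buffer, its interface property, as_view and the _keep attribute are hypothetical stand-ins that mimic the shape of as_ctypes, not NumPy's actual code. The check relies on CPython's reference counting.

```python
import ctypes
import weakref


class Buffer:
    """Hypothetical stand-in for a NumPy array that owns its memory."""

    def __init__(self, n):
        self.data = (ctypes.c_double * n)()

    @property
    def interface(self):
        # Like __array_interface__: a fresh dict of plain values,
        # none of which refers back to the owning object.
        return {"data": (ctypes.addressof(self.data), False),
                "shape": (len(self.data),)}


def as_view(obj, keep_owner):
    """Mimic as_ctypes: build a ctypes array at the owner's address."""
    ai = obj.interface
    addr, _readonly = ai["data"]
    result = (ctypes.c_double * ai["shape"][0]).from_address(addr)
    # Before the patch the metadata dict was kept; after it, the owner is.
    result._keep = obj if keep_owner else ai
    return result


def owner_survives(keep_owner):
    """Return True if the Buffer outlives its last named reference."""
    buf = Buffer(8)
    ref = weakref.ref(buf)
    view = as_view(buf, keep_owner)
    del buf  # the view is now the only thing that could keep the owner alive
    return ref() is not None


def write_through_view():
    """With the owner kept alive, the view stays usable after `del`."""
    buf = Buffer(4)
    view = as_view(buf, keep_owner=True)
    del buf
    view[0] = 1.5
    return view[0]


print(owner_survives(keep_owner=False))
print(owner_survives(keep_owner=True))
print(write_through_view())
```

Only the keep_owner=True variant is ever read from or written to, since in the other variant the memory behind the view has already been released.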

Anthony Sottile answered Oct 23 '22 21:10