
How can I troubleshoot a segmentation fault when working with Python Ctypes and C++?

Tags: c++, python, ctypes

Let's say I have the following two function signatures in C++:

BYTE* init( BYTE* Options, BYTE* Buffer )

and:

int next( BYTE* interface, BYTE* Buffer )

The idea is that I first initialize an Interface object in C++, then repeatedly call the next function from Python, passing a pointer to that object.

The first function returns a BYTE pointer to the Interface via:

Interface*  interface;
// initialize stuff
return((BYTE*) interface);

I call it in Python like this:

class Foo:
  def init(self, data):
    # left out: setting options_ptr
    buf = (c_ubyte * len(data.bytes)).from_buffer_copy(data.bytes)
    init_fun = getattr(self.dll, '?init@@YAPAEPAE0HH@Z')
    init_fun.restype = POINTER(c_ubyte)
    self.interface_ptr = init_fun(options_ptr, buf)
    # this works fine!

  def next(self, data):
    # create buf from other data
    buf = (c_ubyte * len(data.bytes)).from_buffer_copy(data.bytes)
    next_fun = getattr(self.dll, '?next@@YAHPAE0HN@Z')
    ret = next_fun(self.interface_ptr, buf)
    # I randomly get segmentation faults here

I call this from outside with, e.g.:

foo = Foo()
foo.init(some_data)
foo.next(some_other_data)
# ...
foo.next(some_additional_data)

Now, when I run it, I get segmentation faults:

[1]    24712 segmentation fault  python -u test.py

Sometimes it happens after the first call to .next(), sometimes after the eleventh. It appears to be completely random.

There is a C++ test code for the API that works something like this:

BYTE Buffer[500000];
UINT BufSize = 0;
BYTE* Interface;

// not shown here: fill buffer with something
Interface = init(Buffer);
while (true) {
    // not shown here: fill buffer with other data
    int ret = next(Interface, Buffer);
}

I cannot show the exact code, since it's much bigger and proprietary, so the question is: How can I troubleshoot such a segmentation fault? I can break when the exception is thrown (when debugging with VS2012), but it breaks here:

Clearly, that's not useful because nothing is actually done with any buffer at the indicated line. And the variable values are cryptic too:

In my case, data is a BitString object. Could the problem be that the C++ code performs memory operations on the buffer I pass in? Or that Python garbage-collects some data while it is still needed?

More generally, how can I avoid segmentation faults when working with ctypes? I know that the underlying DLL API works fine and doesn't crash.


Update: When I make buf an instance variable, e.g. self._buf, I still get a segmentation fault, but it breaks at a different location during debugging:

asked Sep 28 '22 07:09 by slhck


1 Answer

There were a few misunderstandings I had, all of which led to the problems:

  • When you create a ctypes object in Python and pass it to a C function, and nothing on the Python side references that object afterwards, it is (probably) garbage-collected. Its memory is then freed, even though the C code may still hold a pointer to it and expect the data to be there.

    Therefore, make the buffer an instance variable, e.g. self._buf.

  • The C functions expect the data to be mutable. If they work on the buffer directly instead of copying the data somewhere else, the buffer must be writable. The ctypes documentation specifies this:

    Assigning a new value to instances of the pointer types c_char_p, c_wchar_p, and c_void_p changes the memory location they point to, not the contents of the memory block (of course not, because Python strings are immutable).

    You should be careful, however, not to pass them to functions expecting pointers to mutable memory. If you need mutable memory blocks, ctypes has a create_string_buffer() function which creates these in various ways. The current memory block contents can be accessed (or changed) with the raw property; if you want to access it as NUL terminated string, use the value property:

    So, I did something like this:

    self._buf = create_string_buffer(500000)
    self._buf.value = startdata.bytes

  • The buffer should be used in Python like a normal array, as shown in the example code, where it's filled and the data inside is manipulated. So, for my .next() method, I did this:

    self._buf.value = nextdata.bytes

Now my program runs as expected.
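
The buffer handling described in the points above can be sketched roughly as follows. This is only an illustration under assumptions, not the code I actually run: the proprietary DLL is replaced by a hypothetical stand-in (fake_next, built with ctypes.CFUNCTYPE so the sketch runs anywhere), its signature (buffer pointer plus length) is invented for the example, and the Session class and its method names are made up. What it shows is one mutable buffer that lives as long as the object and is refilled in place before every call.

```python
import ctypes

# Hypothetical stand-in for the DLL function: a ctypes callback with a C
# signature, so the sketch runs without the proprietary library.
# Assumed signature: int fake_next(unsigned char* buffer, size_t length)
NEXT_PROTO = ctypes.CFUNCTYPE(
    ctypes.c_int, ctypes.POINTER(ctypes.c_ubyte), ctypes.c_size_t
)


def _fake_next(buf_ptr, length):
    # Works on the caller's memory directly (sums the bytes, then zeroes
    # them), which is why the buffer has to be mutable.
    total = 0
    for i in range(length):
        total += buf_ptr[i]
        buf_ptr[i] = 0
    return total


# Keep a reference to the callback object for as long as it may be called.
fake_next = NEXT_PROTO(_fake_next)


class Session:
    """Owns one mutable buffer for the whole lifetime of the object."""

    def __init__(self, capacity):
        # Instance attributes keep the memory alive between calls.
        self._buf = ctypes.create_string_buffer(capacity)
        self._ptr = ctypes.cast(self._buf, ctypes.POINTER(ctypes.c_ubyte))

    def next(self, payload):
        if len(payload) > len(self._buf):
            raise ValueError("payload does not fit into the buffer")
        # Refill the existing block in place instead of allocating a new one.
        ctypes.memmove(self._buf, payload, len(payload))
        return fake_next(self._ptr, len(payload))


if __name__ == "__main__":
    session = Session(16)
    print(session.next(bytes([1, 2, 3])))
    print(session.next(b"\x0a\x14"))
```

With the real DLL, setting argtypes and restype on the loaded functions would play the role that NEXT_PROTO plays here. To locate a crash, running the script with python -X faulthandler test.py makes Python print its own traceback when the process receives a segmentation fault.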

answered Oct 05 '22 07:10 by slhck