
Segfault when import_array not in same translation unit

I'm having problems getting the NumPy C API to properly initialize. I think I've isolated the problem to calling import_array from a different translation unit, but I don't know why this should matter.

Minimal working example:

header1.hpp

#ifndef HEADER1_HPP
#define HEADER1_HPP
#include <Python.h>
#include <numpy/npy_3kcompat.h>
#include <numpy/arrayobject.h>

void initialize();

#endif

file1.cpp

#include "header1.hpp"

void* wrap_import_array()
{
  import_array();
  return (void*) 1;
}

void initialize()
{
  wrap_import_array();
}

file2.cpp

#include "header1.hpp"

#include <iostream>

void* loc_wrap_import_array()
{
  import_array();
  return (void*) 1;
}

void loc_initialize()
{
  loc_wrap_import_array();
}

int main()
{
  Py_Initialize();
#ifdef USE_LOC_INIT
  loc_initialize();
#else
  initialize();
#endif
  npy_intp dims[] = {5};
  std::cout << "creating descr" << std::endl;
  PyArray_Descr* dtype = PyArray_DescrFromType(NPY_FLOAT64);
  std::cout << "zeros" << std::endl;
  PyArray_Zeros(1, dims, dtype, 0);
  std::cout << "cleanup" << std::endl;
  return 0;
}

Compiler commands:

g++ file1.cpp file2.cpp -o segissue -lpython3.4m -I/usr/include/python3.4m -DUSE_LOC_INIT
./segissue
# runs fine

g++ file1.cpp file2.cpp -o segissue -lpython3.4m -I/usr/include/python3.4m
./segissue
# segfaults

I've tested this with Clang 3.6.0, GCC 4.9.2, Python 2.7, and Python 3.4 (with a suitably modified wrap_import_array because this is different between Python 2.x and 3.x). The various combinations all give the same result: if I don't call loc_initialize, the program will segfault in the PyArray_DescrFromType call. I have NumPy version 1.8.2. For reference, I'm running this in Ubuntu 15.04.

What baffles me most of all is that the Boost.NumPy C++ wrapper appears to get away with calling import_array in a different translation unit.

What am I missing? Why must I call import_array from the same translation unit in order for it to actually take effect? More importantly, how do I get it to work when I call import_array from a different translation unit like the Boost.NumPy wrapper does?

Asked by helloworld922, Aug 12 '15 at 16:08


1 Answer

After digging through the NumPy headers, I think I've found a solution:

In numpy/__multiarray_api.h, there's a section that decides how PyArray_API, the pointer to the C API function table, is declared. Here is the relevant snippet:

#if defined(PY_ARRAY_UNIQUE_SYMBOL)
#define PyArray_API PY_ARRAY_UNIQUE_SYMBOL
#endif

#if defined(NO_IMPORT) || defined(NO_IMPORT_ARRAY)
extern void **PyArray_API;
#else
#if defined(PY_ARRAY_UNIQUE_SYMBOL)
void **PyArray_API;
#else
static void **PyArray_API=NULL;
#endif
#endif

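By default PyArray_API is declared static, so every translation unit that includes the header gets its own private copy, initialized to NULL. import_array only fills in the copy belonging to the translation unit that calls it. In the question's example, file2.cpp's copy is never set, and PyArray_DescrFromType (a macro that calls through that table) dereferences a NULL pointer and segfaults. This default looks like it is intended to let multiple extension modules each keep their own API table, with each module making its own import_array call.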
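The effect of the static declaration can be modeled without NumPy at all. The sketch below is only an illustration: the two namespaces stand in for file1.cpp and file2.cpp, and every name in it (fake_api_table, PyArray_API_model, check_static_model) is hypothetical rather than part of the NumPy API.

```cpp
#include <cassert>

// Hypothetical stand-in for the function table that import_array() fetches.
static void* fake_api_table[1] = { nullptr };

namespace unit_file1 {                          // plays the role of file1.cpp
    static void** PyArray_API_model = nullptr;  // this unit's private copy
    void initialize() { PyArray_API_model = fake_api_table; }
    bool api_ready() { return PyArray_API_model != nullptr; }
}

namespace unit_file2 {                          // plays the role of file2.cpp
    static void** PyArray_API_model = nullptr;  // a separate copy nobody sets
    bool api_ready() { return PyArray_API_model != nullptr; }
}

// Initializing one unit's copy leaves the other unit's copy untouched.
void check_static_model() {
    unit_file1::initialize();
    assert(unit_file1::api_ready());
    assert(!unit_file2::api_ready());
}
```

Calling check_static_model() passes both assertions: the second unit still holds a null pointer after the first unit has initialized its own copy, which is the state main() is in when it calls PyArray_DescrFromType.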

A consistent way to get several translation units to share the same API table is to define PY_ARRAY_UNIQUE_SYMBOL to some library-unique name in every translation unit, and to additionally define NO_IMPORT or NO_IMPORT_ARRAY in every translation unit except the one that contains the import_array wrapper. These macros must be defined before numpy/arrayobject.h is included. Incidentally, there are similar macros for the ufunc features: PY_UFUNC_UNIQUE_SYMBOL, and NO_IMPORT/NO_IMPORT_UFUNC.
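As a rough model of what the preprocessor produces once these macros are in place, the sketch below puts both views of the symbol in one file: the single definition that the importing unit would see, and the extern declaration that every other unit would see. It assumes the unique symbol is named MYLIBRARY_ARRAY_API; fake_api_table and the model_* helpers are hypothetical stand-ins, not NumPy functions.

```cpp
#include <cassert>

// Hypothetical stand-in for the function table that import_array() fetches.
static void* fake_api_table[1] = { nullptr };

// View of the unit that calls import_array (unique symbol set, no NO_IMPORT):
// the one real definition, with external linkage.
void** MYLIBRARY_ARRAY_API = nullptr;

// View of every other unit (unique symbol set, NO_IMPORT_ARRAY defined):
// a declaration only, so it refers to the definition above.
extern void** MYLIBRARY_ARRAY_API;

// Models the import_array wrapper living in the importing unit.
void model_initialize() { MYLIBRARY_ARRAY_API = fake_api_table; }

// Models code in any other unit that needs the API table.
bool model_api_ready() { return MYLIBRARY_ARRAY_API != nullptr; }

void check_shared_model() {
    assert(!model_api_ready());  // nothing imported yet
    model_initialize();
    assert(model_api_ready());   // one initialization is visible everywhere
}
```

Because there is a single object behind the name, one initialization is enough for all users of the symbol. In a real build the definition and the extern declarations live in different object files and the linker connects them.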

The modified working example:

header1.hpp

#ifndef HEADER1_HPP
#define HEADER1_HPP

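// Every translation unit except the one that calls import_array must only
// declare the API pointers (extern), not define them.
#ifndef MYLIBRARY_USE_IMPORT
#define NO_IMPORT
#endif

// Library-unique names give the API pointers external linkage so that all
// translation units share them.
#define PY_ARRAY_UNIQUE_SYMBOL MYLIBRARY_ARRAY_API
#define PY_UFUNC_UNIQUE_SYMBOL MYLIBRARY_UFUNC_API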

#include <Python.h>
#include <numpy/npy_3kcompat.h>
#include <numpy/arrayobject.h>

void initialize();

#endif

file1.cpp

// Must be defined before the include so that this unit defines the API pointer.
#define MYLIBRARY_USE_IMPORT
#include "header1.hpp"

void* wrap_import_array()
{
  import_array();
  return (void*) 1;
}

void initialize()
{
  wrap_import_array();
}

file2.cpp

#include "header1.hpp"

#include <iostream>

int main()
{
  Py_Initialize();
  initialize();
  npy_intp dims[] = {5};
  std::cout << "creating descr" << std::endl;
  PyArray_Descr* dtype = PyArray_DescrFromType(NPY_FLOAT64);
  std::cout << "zeros" << std::endl;
  PyArray_Zeros(1, dims, dtype, 0);
  std::cout << "cleanup" << std::endl;
  return 0;
}

I don't know what pitfalls there are with this hack or if there are any better alternatives, but this appears to at least compile and run without any segfaults.

Answered by helloworld922, Oct 25 '22 at 10:10