
C++ vector to Python 3.3

I would like to get a Python list, say [1, 2, 3, 4], from a C++ program. I wrote the C++ code, which returns a std::vector.

How can I connect the two ends without SWIG, SIP, Cython, or similar tools?

Would it be easier to just compile the C++ to an .exe or ELF file, call it from the command line, have it write the vector to a .txt file, and then read that file in with Python?
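For reference, the file-based route can be sketched from the Python side as follows. This assumes the compiled program accepts an output path on its command line and writes one integer per line. The helper name run_and_read, the binary name in the docstring, and the argument order are all hypothetical.

```python
import subprocess

def run_and_read(cmd, out_path):
    """Run an external program, then read one integer per line from out_path.

    cmd is the full argument list, e.g. ["./count_missing", "foo.txt", out_path]
    (a hypothetical binary name and argument order).
    """
    # check_call() raises CalledProcessError if the program exits with an error.
    subprocess.check_call(cmd)
    with open(out_path) as f:
        # Skip blank lines so a trailing newline does not break int().
        return [int(line) for line in f if line.strip()]
```

This works, but every value makes a round trip through text, which is the overhead the in-process approaches avoid.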

My point is, I only need a really small function from C++ to do the heavy calculations on huge data. What would be the least painful and shortest method to do just this?

EDIT: To give an example: Python will pass a filename string to C++ ("foo.txt"), which will then read the contents of the file (200,000 rows by 300 columns), count the missing values, and return to Python the number of missing values per row. This yields a list of 200,000 numbers. How can the two sides communicate?

Just for completeness, these are the steps I am still unsure about:

  • Pass Python filename string to C++
  • Receive Python string in C++
  • DONE Create vector in C++
  • Return vector to Python
  • Receive vector in Python
PascalVKooten asked May 22 '13 13:05


2 Answers

This is probably moot now, and I posted something similar on your other question, but I've adapted this version for Python 3.3 and C++ rather than Python 2.7 and C.

If you want to get back a Python list object, and since you're building a list which could potentially be very long (200,000 items), it's probably more efficient to build the Python list in the C++ code, rather than building a std::vector and then converting that to a Python list later on.

Based on the code in your other question, I'd suggest using something like this...

// foo.cpp
#include <python3.3/Python.h>
#include <fstream>
#include <string>
using namespace std;

extern "C"
{
    PyObject* foo(const char* FILE_NAME)
    {
        string line;
        ifstream myfile(FILE_NAME);
        PyObject* result = PyList_New(0);
        while (getline(myfile, line))
        {
            PyObject* item = PyLong_FromLong(1);  // new reference (placeholder value)
            PyList_Append(result, item);          // does not steal the reference...
            Py_DECREF(item);                      // ...so release ours to avoid a leak
        }
        return result;
    }
}

...compiled with...

$ g++ -fPIC -shared -o foo.so foo.cpp -lpython3.3m

...and an example of usage...

>>> from ctypes import *
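>>> foo = PyDLL('./foo.so')  # PyDLL, not CDLL: the GIL must stay held while foo() calls the Python C API
>>> foo.foo.argtypes = [c_char_p]
>>> foo.foo.restype = py_object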
>>> foo.foo(b'foo.cpp')
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
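The session above passes the filename as a bytes literal. When the name starts out as a Python 3 str, it has to be encoded before it can be handed to a const char* parameter. A minimal sketch, where the helper name as_c_path is hypothetical:

```python
import ctypes
import os

def as_c_path(filename):
    """Encode a str filename to bytes suitable for a const char* parameter."""
    # os.fsencode() applies the filesystem encoding and passes bytes through unchanged.
    return os.fsencode(filename)

# Wrapping the bytes in c_char_p mirrors what ctypes does for a c_char_p argument.
arg = ctypes.c_char_p(as_c_path("foo.txt"))
print(arg.value)  # b'foo.txt'
```

If argtypes is set to [c_char_p] on the function, the encoded bytes can be passed directly and ctypes performs the wrapping itself.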

If you need to convert an existing std::vector to a Python list instead, you can pre-allocate the list by passing the length of the vector to PyList_New(), and then fill it with PyList_SetItem() instead of PyList_Append(). Unlike PyList_Append(), PyList_SetItem() steals the reference to the item, so no Py_DECREF() is needed in that case.

The only other methods I can think of would be...

  1. To pre-allocate a block of RAM in Python, and have the C++ function fill in the values, like in qarma's answer, but you'd have to know in advance how much RAM to allocate. You could just pick an arbitrary value, but given that the number of lines in the file isn't known in advance, this number may be way too large or way too small.

  2. To heap-allocate the std::vector in C++, and return a pointer to the first element, and the number of elements, but you'd have to write a second function to free the RAM once you were done with it.

Either way, you still have the overhead of converting the 'returned' array into a Python list, so you may as well do it yourself.
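From the Python side, both alternatives end with the same conversion step. The sketch below uses a ctypes array created in Python as a stand-in for memory that C++ would fill or return, so no shared library is loaded. The helper name pointer_to_list is hypothetical.

```python
import ctypes

def pointer_to_list(ptr, count):
    """Copy count elements from a ctypes pointer into a new Python list."""
    # Slicing a ctypes pointer reads exactly the requested range; the length
    # must come from the caller because a raw pointer carries no size.
    return ptr[:count]

# Method 1: Python pre-allocates, the native function would fill the buffer.
buf = (ctypes.c_long * 5)()
for i in range(len(buf)):       # stand-in for the C++ function writing results
    buf[i] = i * i
print(list(buf))                # [0, 1, 4, 9, 16]

# Method 2: the native function would return a pointer plus an element count.
ptr = ctypes.cast(buf, ctypes.POINTER(ctypes.c_long))
print(pointer_to_list(ptr, 5))  # [0, 1, 4, 9, 16]
```

In method 2 the resulting list is a copy, so the C++ function that frees the memory can be called as soon as the slice has been taken.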

Aya answered Oct 20 '22 00:10


Define your entry point extern "C" and use ctypes.

Here's an example to get you started. Data is passed from Python, the C++ code sorts it in place, and Python reads back the result:

#include <sys/types.h>
#include <algorithm>

extern "C" {
    void foo(float* arr, size_t len);
}

void foo(float* arr, size_t len)
{
    // if arr is input, convert to C++ array

    // crazy C++ code here
    std::sort(arr, arr+len);

    // if arr is output, convert C++ array to arr
}

Compile your code into a shared object (libxxx.so on Linux, libxxx.dll on Windows, libxxx.dylib on OS X), then load it dynamically and pass data in and out via ctypes:

import ctypes
import random

# Linux: .so; OS X: .dylib; Windows: .dll. Check sys.platform for runtime detection.
libxxx = ctypes.CDLL("./libxxx.so")
libxxx.foo.argtypes = [ctypes.POINTER(ctypes.c_float), ctypes.c_size_t]
libxxx.foo.restype = None

# a fixed-size C array of 100 floats, zero-initialised
data = (ctypes.c_float * 100)()

# write something into data[x]
for i in range(100):
    data[i] = random.random()
print(data[:3], "...", data[-3:])

# sorts the array in place on the C++ side
libxxx.foo(data, len(data))

# read out from data[x]
print(data[:3], "...", data[-3:])

The great thing about ctypes is that it has been bundled with Python since 2.5, so you don't need any additional libraries.
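The platform-specific library names mentioned earlier can be chosen at run time from sys.platform. A minimal sketch, assuming the lib prefix is used on every platform as in the names above; the helper name shared_lib_name is hypothetical:

```python
import sys

def shared_lib_name(stem, platform=None):
    """Return the conventional shared-library filename for a platform string."""
    # Default to the running interpreter's platform; the parameter eases testing.
    platform = sys.platform if platform is None else platform
    if platform.startswith(("win", "cygwin")):
        return "lib" + stem + ".dll"
    if platform == "darwin":
        return "lib" + stem + ".dylib"
    return "lib" + stem + ".so"   # Linux and most other Unix-like systems

print(shared_lib_name("xxx", "darwin"))  # libxxx.dylib
```

The result could then be passed to ctypes.CDLL, for example ctypes.CDLL("./" + shared_lib_name("xxx")).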

If you want to use something more advanced, have a look at cffi.

Dima Tisnek answered Oct 20 '22 01:10