For performance reasons I want to port parts of my Python program to C++, so I am trying to write a simple extension for it. The C++ part will build a dictionary, which then needs to be delivered to the Python program.
One way I found is to build a dict-like object in C++, e.g. a boost::unordered_map, and then translate it to Python using Py_BuildValue, which is able to produce Python dicts. But this approach, which as far as I can tell involves converting the container into a string representation and back, seems too roundabout to be the most performant solution.
So my question is: what is the most performant way to build a Python dictionary in C++? I saw that Boost has a Python library which supports mapping containers between C++ and Python, but so far I have not found exactly what I need in its documentation. If there is such a way, I would prefer to build a Python dict directly in C++, so that no copying is needed. But if another approach is the most performant one, I'm good with that too.
Here is the (simplified) C++-code I compile into a .dll/.pyd:
#include <iostream>
#include <string>
#include <Python.h>
#include "boost/unordered_map.hpp"
#include "boost/foreach.hpp"

extern "C"{
typedef boost::unordered_map<std::string, int> hashmap;

static PyObject*
_rint(PyObject* self, PyObject* args)
{
    hashmap my_hashmap; // DO I NEED THIS?
    my_hashmap["a"] = 1; // CAN I RATHER INSERT TO PYTHON DICT DIRECTLY??
    BOOST_FOREACH(hashmap::value_type i, my_hashmap) {
        // INSERT ELEMENT TO PYTHON DICT
    }
    // return PYTHON DICT
}

static PyMethodDef TestMethods[] = {
    {"rint", _rint, METH_VARARGS, ""},
    {NULL, NULL, 0, NULL}
};

PyMODINIT_FUNC
inittest(void)
{
    Py_InitModule("test", TestMethods);
}
} // extern "C"
This I want to use in Python like:
import test
new_dict = test.rint()
The dictionary will map strings to integers. Thanks for any help!
Section 6.6 of The C Programming Language presents a simple dictionary (hash table) data structure. I don't think a useful dictionary implementation could get any simpler than this. For your convenience, I reproduce the code here. Note that strings whose hashes collide end up in the same bucket, so if many keys collide, lookup time can degrade to O(n).
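As a rough illustration of that kind of structure, here is a minimal sketch of a chained hash table mapping strings to ints. It is not the book's listing: the class name Dict, the bucket count of 101, and the multiply-by-31 string hash are all assumptions chosen for the example.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical minimal dictionary: fixed number of buckets, each bucket a chain.
class Dict {
    struct Entry {
        std::string name;
        int value;
    };
    std::vector<std::vector<Entry> > table_;

    // Simple multiplicative string hash, reduced to a bucket index.
    std::size_t hash(const std::string& s) const {
        std::size_t h = 0;
        for (std::size_t i = 0; i < s.size(); ++i)
            h = static_cast<unsigned char>(s[i]) + 31 * h;
        return h % table_.size();
    }

public:
    explicit Dict(std::size_t buckets = 101) : table_(buckets) {}

    // Insert a new key or overwrite the value of an existing one.
    void install(const std::string& name, int value) {
        std::vector<Entry>& chain = table_[hash(name)];
        for (std::size_t i = 0; i < chain.size(); ++i) {
            if (chain[i].name == name) {
                chain[i].value = value;
                return;
            }
        }
        Entry e = {name, value};
        chain.push_back(e);
    }

    // Returns true and writes the value to out if the key is present.
    // Colliding keys share a chain, so this scan is linear in the chain length.
    bool lookup(const std::string& name, int& out) const {
        const std::vector<Entry>& chain = table_[hash(name)];
        for (std::size_t i = 0; i < chain.size(); ++i) {
            if (chain[i].name == name) {
                out = chain[i].value;
                return true;
            }
        }
        return false;
    }
};
```

With a fixed bucket count the chains grow as more keys are added, which is where the O(n) worst case comes from.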
They are called hash tables or hash maps. The C++ standard library provides several, such as std::unordered_map and std::unordered_set.
Analysis of the test run result: a dictionary is 6.6 times faster than a list when we look up an item among 100 items.
A list is an ordered collection of items, whereas a dictionary stores key-value pairs in a hash table. Because of this, finding an element in a list means scanning it item by item, while looking up a key in a dictionary takes constant time on average. Therefore, lookups are faster in a dictionary than in a list in Python.
PyObject *d = PyDict_New();
for (...) {
    PyDict_SetItem(d, key, val);
}
return d;
__setitem__ and __getitem__. In both methods, use your original hashmap. In the end, no copy will happen!