Let's say we are using Python and calling some DLL libraries written in C++. We open a very large dataset in Python and then we would like to call a library written in C++ and add an array with that opened data as a parameter. Library would do something with that array and then return it back to Python code.
So the question is: Is it possible to use the same location of a memory? Because in that case we do not need to copy a huge amount of data two times.
Extending Python with C or C++ It is quite easy to add new built-in modules to Python, if you know how to program in C. Such extension modules can do two things that can't be done directly in Python: they can implement new built-in object types, and they can call C library functions and system calls.
Python uses a garbage collection algorithm (called Garbage Collector) that keeps the Heap memory clean and removes objects that are not needed anymore. You don't need to mess with the Heap, but it is better to understand how Python manages the Heap since most of your data is stored in this section of the memory.
In some cases, data reading and writing are not necessarily in files, but it needs to be read and written in memory. Python has a built-in module named StringIO . It can produce file-like objects (also known as memory files or string buffer) and be read and written just like files.
It all comes down to how you load your data in memory and what type of data it is.
If it's numerical data and you use e.g. a numpy array, it's are already stored with a memory layout trivially usable from C or C++ code. It's easy to obtain the address of the block of data (numpy.ndarray.ctypes.data
) and pass it to the C++ code through ctypes. You can see a nice example here. Image data is similar in this regard (PIL images are in a simple memory format and the pointer to their data can be obtained easily).
If, on the other hand, your data is in regular "native" Python structures (e.g. regular lists or regular objects), the situation is more tricky. You can pass them straight to C++ code, but it's code that must know about Python data structures - so, written especially for this purpose, using python.h
and dealing with the non-trivial Python API.
This works using memory mapped files. I do not claim high speed or efficiency in any way. These are just to show an example of it working.
$ python --version
Python 3.7.9
$ g++ --version
g++ (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
The C++ side only monitors the values it needs. The Python side only provides the values.
Note: the file name "pods.txt" must be the same in the C++ and python code.
#include <sys/mman.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
int main(void)
{
// assume file exists
int fd = -1;
if ((fd = open("pods.txt", O_RDWR, 0)) == -1)
{
printf("unable to open pods.txt\n");
return 0;
}
// open the file in shared memory
char* shared = (char*) mmap(NULL, 8, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
// periodically read the file contents
while (true)
{
printf("0x%02X 0x%02X 0x%02X 0x%02X 0x%02X 0x%02X 0x%02X 0x%02X\n", shared[0], shared[1], shared[2], shared[3], shared[4], shared[5], shared[6], shared[7]);
sleep(1);
}
return 0;
}
The python side:
import mmap
import os
import time
fname = './pods.txt'
if not os.path.isfile(fname):
# create initial file
with open(fname, "w+b") as fd:
fd.write(b'\x01\x00\x00\x00\x00\x00\x00\x00')
# at this point, file exists, so memory map it
with open(fname, "r+b") as fd:
mm = mmap.mmap(fd.fileno(), 8, access=mmap.ACCESS_WRITE, offset=0)
# set one of the pods to true (== 0x01) all the rest to false
posn = 0
while True:
print(f'writing posn:{posn}')
# reset to the start of the file
mm.seek(0)
# write the true/false values, only one is true
for count in range(8):
curr = b'\x01' if count == posn else b'\x00'
mm.write(curr)
# admire the view
time.sleep(2)
# set up for the next position in the next loop
posn = (posn + 1) % 8
mm.close()
fd.close()
To run it, in terminal #1:
a.out # or whatever you called the C++ executable
0x00 0x00 0x00 0x00 0x01 0x00 0x00 0x00
0x00 0x00 0x00 0x00 0x01 0x00 0x00 0x00
0x01 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x01 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x00 0x01 0x00 0x00 0x00 0x00 0x00 0x00
0x00 0x01 0x00 0x00 0x00 0x00 0x00 0x00
0x00 0x00 0x01 0x00 0x00 0x00 0x00 0x00
0x00 0x00 0x01 0x00 0x00 0x00 0x00 0x00
0x00 0x00 0x00 0x01 0x00 0x00 0x00 0x00
i.e. you should see the 0x01 move one step every couple of seconds because of the sleep(2) in the C++ code.
in terminal #2:
python my.py # or whatever you called the python file
writing posn:0
writing posn:1
writing posn:2
i.e. you should see the position change from 0 through 7 back to 0 again.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With