
Python and C++ sharing the same memory resources

Tags: c++, python, memory

Let's say we are using Python and calling DLLs written in C++. We open a very large dataset in Python and then want to call a C++ library function, passing an array that holds the loaded data as a parameter. The library would process that array and return it to the Python code.

So the question is: can both sides use the same memory location? If so, we would not need to copy a huge amount of data twice.

asked Feb 13 '18 22:02 by Matphy



2 Answers

It all comes down to how you load your data in memory and what type of data it is.

If it's numerical data and you use, e.g., a numpy array, it is already stored with a memory layout trivially usable from C or C++ code. It's easy to obtain the address of the block of data (numpy.ndarray.ctypes.data) and pass it to the C++ code through ctypes. You can see a nice example here. Image data is similar in this regard (PIL images are in a simple memory format and the pointer to their data can be obtained easily).
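
A minimal sketch of the zero-copy idea, using only the standard library so it runs without a compiled DLL. Here `array.array` stands in for the numpy array (an assumption for illustration; with numpy the address would come from `ndarray.ctypes.data`), and the helper name `as_c_array` is hypothetical. The ctypes view shares the array's memory, so a write made through it shows up in the Python object without any copy.

```python
import array
import ctypes

def as_c_array(buf: array.array):
    """Return a ctypes double array that views buf's memory (no copy)."""
    if buf.typecode != "d":
        raise TypeError("this sketch only handles arrays of C doubles")
    return (ctypes.c_double * len(buf)).from_buffer(buf)

data = array.array("d", [1.0, 2.0, 3.0])
view = as_c_array(data)

# Same address: this is the pointer a C++ function would receive.
same_address = ctypes.addressof(view) == data.buffer_info()[0]

# A write through the ctypes view lands in the original array.
view[0] = 42.0
print(same_address, data[0])  # prints: True 42.0
```

With a real library loaded through `ctypes.CDLL`, `view` would be passed as the `double*` argument of the exported function.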

If, on the other hand, your data is in regular "native" Python structures (e.g. regular lists or regular objects), the situation is trickier. You can pass them straight to C++ code, but that code must know about Python data structures, so it has to be written especially for this purpose, using Python.h and dealing with the non-trivial Python C API.
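
To illustrate why native structures are harder: a Python list stores references to separate float objects, not a contiguous block of doubles, so handing it to plain C++ code (code that does not use Python.h) means packing it into a buffer first, and that packing is a copy. A minimal sketch; the helper name `pack_doubles` is hypothetical.

```python
import ctypes

def pack_doubles(values):
    """Copy a Python list of floats into a contiguous C double array."""
    return (ctypes.c_double * len(values))(*values)

values = [1.0, 2.0, 3.0]
packed = pack_doubles(values)

# The C array is an independent copy: changing it leaves the list untouched.
packed[0] = 42.0
print(values[0], packed[0])  # prints: 1.0 42.0
```

This is the copy the question wants to avoid, which is why loading the data directly into a buffer-backed container tends to be the simpler route.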

answered Oct 24 '22 11:10 by Matteo Italia


This works using memory-mapped files. I do not claim high speed or efficiency in any way; this is just an example to show it working.

 $ python --version
 Python 3.7.9

 $ g++ --version
 g++ (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0

The C++ side only monitors the values it needs. The Python side only provides the values.

Note: the file name "pods.txt" must be the same in the C++ and Python code, and both programs must be run from the same directory.

#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
  {
  // the file must already exist (the Python side creates it)
  int fd = open("pods.txt", O_RDWR);
  if (fd == -1)
     {
     perror("unable to open pods.txt");
     return 1;
     }

  // map the first 8 bytes of the file as shared memory
  void* addr = mmap(NULL, 8, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
  if (addr == MAP_FAILED)
     {
     perror("mmap failed");
     close(fd);
     return 1;
     }

  // unsigned char, so bytes >= 0x80 are not sign-extended when printed
  unsigned char* shared = (unsigned char*) addr;

  // print the mapped bytes once per second (stop with Ctrl+C)
  while (true)
      {
      for (int i = 0; i < 8; ++i)
          {
          printf("0x%02X%c", shared[i], i == 7 ? '\n' : ' ');
          }
      sleep(1);
      }

   return 0;
   }

The Python side:

import mmap
import os
import time

fname = './pods.txt'
if not os.path.isfile(fname):
    # create the initial 8-byte file
    with open(fname, "w+b") as fd:
        fd.write(b'\x01\x00\x00\x00\x00\x00\x00\x00')

# at this point the file exists, so memory map it
with open(fname, "r+b") as fd:
    # both context managers close on exit, including on Ctrl+C
    with mmap.mmap(fd.fileno(), 8, access=mmap.ACCESS_WRITE, offset=0) as mm:
        # set one of the pods to true (== 0x01), all the rest to false
        posn = 0
        while True:
            print(f'writing posn:{posn}')

            # reset to the start of the file
            mm.seek(0)

            # write the true/false values, only one is true
            for count in range(8):
                curr = b'\x01' if count == posn else b'\x00'
                mm.write(curr)

            # admire the view
            time.sleep(2)

            # set up for the next position in the next loop
            posn = (posn + 1) % 8

To run it, in terminal #1:

 ./a.out  # or whatever you called the C++ executable
 0x00 0x00 0x00 0x00 0x01 0x00 0x00 0x00
 0x00 0x00 0x00 0x00 0x01 0x00 0x00 0x00
 0x01 0x00 0x00 0x00 0x00 0x00 0x00 0x00
 0x01 0x00 0x00 0x00 0x00 0x00 0x00 0x00
 0x00 0x01 0x00 0x00 0x00 0x00 0x00 0x00
 0x00 0x01 0x00 0x00 0x00 0x00 0x00 0x00
 0x00 0x00 0x01 0x00 0x00 0x00 0x00 0x00
 0x00 0x00 0x01 0x00 0x00 0x00 0x00 0x00
 0x00 0x00 0x00 0x01 0x00 0x00 0x00 0x00

i.e. you should see the 0x01 move one step every couple of seconds because of the time.sleep(2) in the Python code. Each state is printed about twice because the C++ side polls once per second.

In terminal #2:

python my.py  # or whatever you called the python file
writing posn:0
writing posn:1
writing posn:2

i.e. you should see the position change from 0 through 7 back to 0 again.
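
As a quick sanity check of the same mechanism without compiling anything, here is a minimal sketch in which two separate mappings of one file live in a single Python process. A temporary file stands in for pods.txt, and the helper name `shared_roundtrip` is hypothetical. Bytes written through one mapping are read back through the other, which is what the C++ reader relies on.

```python
import mmap
import os
import tempfile

def shared_roundtrip(payload: bytes) -> bytes:
    """Write payload through one mapping and read it back through another."""
    fd, path = tempfile.mkstemp()
    try:
        # the file must be at least as large as the region being mapped
        os.write(fd, b"\x00" * len(payload))
        writer = mmap.mmap(fd, len(payload), access=mmap.ACCESS_WRITE)
        with open(path, "r+b") as f:
            reader = mmap.mmap(f.fileno(), len(payload), access=mmap.ACCESS_READ)
        try:
            writer[:] = payload      # write via the first mapping
            return bytes(reader[:])  # read via the second mapping
        finally:
            reader.close()
            writer.close()
    finally:
        os.close(fd)
        os.remove(path)

pods = b"\x00\x00\x01\x00\x00\x00\x00\x00"
print(shared_roundtrip(pods) == pods)
```

The payload must be non-empty, since an empty file cannot be mapped.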

answered Oct 24 '22 11:10 by JohnA