
Serialize C++ object to send via sockets to Python - best approach?

I need to create a network communication between two different frameworks, one written in C++ and the other in Python.

To exchange data, I want to create some kind of flexible structure (basically a struct) in C++, which is serialised, sent through sockets to Python and then deserialised.

What is the most common way to do this? I'm sure Boost could handle it on both sides, since there is Boost.Python, but I don't want to inflate the project requirements that much. Is there a smaller library, or some other elegant solution, short of specifying my own binary data format?

UPDATE:

So here is one example of how to use Google's protobuf to send a data structure from a C++ program to a Python script via UDP. This was tested on Mac OS X Mavericks but should work fine on other Unix systems too.

Installing protobuf

The first step is of course installing the protobuf library. I used homebrew for the main library and pip to install the Python modules:

brew install protobuf
pip install protobuf

Then I defined a very simple data structure using the proto-syntax:

Filename: foo.proto

// "required" fields only exist in proto2, so state the syntax explicitly
syntax = "proto2";

package prototest;

message Foo {
  required int32 id = 1;
  required string bar = 2;
  optional string baz = 3;
}

This proto-file can now be translated into C++ and Python classes via:

protoc foo.proto --cpp_out=. --python_out=.

The folder should now contain the C++ header and source files and the Python code:

├── foo.pb.cc
├── foo.pb.h
├── foo.proto
└── foo_pb2.py

Let's have a look at the very basic C++ code, which is meant to send an instance of foo over the network, using UDP (to localhost on port 5555):

Filename: send.cc

#include <arpa/inet.h>
#include <netinet/in.h>  // struct sockaddr_in
#include <sys/socket.h>
#include <unistd.h>      // close()

#include <string>

// this is our proto of foo
#include "foo.pb.h"

int main(int argc, char **argv)
{
  // zero-initialise the address so that no member is left indeterminate
  struct sockaddr_in addr = {};

  addr.sin_family = AF_INET;
  inet_aton("127.0.0.1", &addr.sin_addr);
  addr.sin_port = htons(5555);

  // initialise a foo and set some properties
  GOOGLE_PROTOBUF_VERIFY_VERSION;
  prototest::Foo foo;
  foo.set_id(4);
  foo.set_bar("narf");

  // serialise to string, this one is obvious ; )
  std::string buf;
  foo.SerializeToString(&buf);

  // send the serialised bytes as a single UDP datagram
  int sock = socket(PF_INET, SOCK_DGRAM, 0);
  sendto(sock, buf.data(), buf.size(), 0, (struct sockaddr *)&addr, sizeof(addr));
  close(sock);

  return 0;
}

I compiled it via clang++:

clang++ -o send send.cc foo.pb.cc -lprotobuf
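
To make it clearer what SerializeToString() puts into the datagram, here is a rough sketch that builds the same bytes by hand in plain Python, following the documented protobuf wire format (a key of field number and wire type, then the value). The helper names encode_varint, encode_key and encode_foo are made up for this illustration and are not part of the protobuf library. The sketch only covers non-negative integers and the two fields set in send.cc.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative values")
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low7)
            return bytes(out)


def encode_key(field_number: int, wire_type: int) -> bytes:
    """A field key is (field_number << 3) | wire_type, itself a varint."""
    return encode_varint((field_number << 3) | wire_type)


def encode_foo(id_: int, bar: str) -> bytes:
    """Hand-encode Foo{id, bar} as defined in foo.proto (baz left unset)."""
    bar_bytes = bar.encode("utf-8")
    return (
        encode_key(1, 0) + encode_varint(id_)                 # id: wire type 0 (varint)
        + encode_key(2, 2) + encode_varint(len(bar_bytes))    # bar: wire type 2 (length-delimited)
        + bar_bytes
    )


payload = encode_foo(4, "narf")
print(payload.hex())  # 080412046e617266
```

So the whole message is eight bytes, and the unset optional field baz takes no space at all. In real code the generated classes should of course do this work.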

And finally, this is the Python code, which waits for UDP packets and deserialises them into foo. As with the C++ sender, there is no error checking whatsoever; this is only to demonstrate the functionality:

Filename: receive.py

import socket
from foo_pb2 import Foo

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(("127.0.0.1", 5555))

foo = Foo()
while True:
    data, addr = sock.recvfrom(1024)
    foo.ParseFromString(data)
    print("Got foo with id={0} and bar={1}".format(foo.id, foo.bar))
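
As a rough illustration of what ParseFromString() has to do with the received bytes, the following sketch decodes such a datagram without the generated classes. It is plain Python; decode_varint and decode_fields are hypothetical helpers written for this example, and only wire types 0 (varint) and 2 (length-delimited) are handled, which is all that foo.proto needs. The sample bytes assume a Foo with id=4 and bar="narf".

```python
def decode_varint(data: bytes, pos: int):
    """Read one base-128 varint starting at pos; return (value, new_pos)."""
    result = 0
    shift = 0
    while True:
        if pos >= len(data):
            raise ValueError("truncated varint")
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear: last byte of this varint
            return result, pos
        shift += 7


def decode_fields(data: bytes) -> dict:
    """Map field numbers to raw values (int for varint, bytes for strings)."""
    fields = {}
    pos = 0
    while pos < len(data):
        key, pos = decode_varint(data, pos)
        field_number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:
            value, pos = decode_varint(data, pos)
        elif wire_type == 2:
            length, pos = decode_varint(data, pos)
            if pos + length > len(data):
                raise ValueError("truncated length-delimited field")
            value = data[pos:pos + length]
            pos += length
        else:
            raise ValueError(f"wire type {wire_type} not handled in this sketch")
        fields[field_number] = value
    return fields


datagram = bytes.fromhex("080412046e617266")  # Foo with id=4, bar="narf"
fields = decode_fields(datagram)
print(fields)  # {1: 4, 2: b'narf'}
```

The field numbers 1 and 2 are the ones assigned in foo.proto; the names id and bar never travel over the wire, which is why both sides need the same .proto file.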

Now we're done and this is the final directory structure:

├── foo.pb.cc
├── foo.pb.h
├── foo.proto
├── foo_pb2.py
├── receive.py
├── send
└── send.cc

To test the script, simply run receive.py to listen to UDP packets via

python receive.py

and watch its output while you run the compiled C++ sender:

./send
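
If you want to check the receiver before the C++ side compiles, a small stand-in sender in Python might look like the sketch below. It assumes the datagram for a Foo with id=4 and bar="narf" is the eight bytes 08 04 12 04 6e 61 72 66 (the protobuf wire encoding of those two fields), and it uses only the standard socket module. FOO_DATAGRAM and send_foo are names invented for this example.

```python
import socket

# Wire encoding of Foo{id: 4, bar: "narf"}, hard-coded for this test sender.
FOO_DATAGRAM = bytes.fromhex("080412046e617266")


def send_foo(host: str = "127.0.0.1", port: int = 5555) -> int:
    """Send the canned Foo datagram via UDP; return the number of bytes sent."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        return sock.sendto(FOO_DATAGRAM, (host, port))


if __name__ == "__main__":
    print("sent", send_foo(), "bytes")
```

With receive.py running in another terminal, this should produce the same output line as ./send.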

tamasgal asked May 21 '14 13:05



2 Answers

Protocol Buffers' successor, Cap'n Proto, also has good support for C++ and Python. (Disclosure: I am the author of Cap'n Proto, and also was the author of most of the Protobuf code released by Google.)

Kenton Varda answered Oct 19 '22 19:10


Go for Protocol Buffers (Google Code), which is well supported in both C++ and Python. You can define a single message structure that is readable from both languages.

Protocol Buffers are a method of serializing structured data. As such, they are useful in developing programs to communicate with each other over a wire or for storing data. The method involves an interface description language that describes the structure of some data and a program that generates from that description source code in various programming languages for generating or parsing a stream of bytes that represents the structured data. †

masoud answered Oct 19 '22 19:10