
Using SWIG to bind Google Protocol Buffers

I'm writing a Python program that needs to process a lot of small but complex protobuf-encoded messages. I tried the Python implementation of protocol buffers, which is written in pure Python, but its performance is really terrible.

So I'm looking into a solution that some folks apparently got to work: use protoc to generate C++ files, then use swig to wrap them for Python. The problem is that I can't get a working Python module out of it.

  • When I run swig with -includeall (to make sure that all the Google base/utility classes used by the generated message classes also get wrapped), swig fails, complaining about missing system include files (e.g. "string"). I couldn't work around this with -I flags or by copying entire include directories. The environment is Ubuntu 10.04, protobuf 2.2.0, swig 1.3.40, gcc 4.4.3.

  • Without this flag, I can generate a Python module for my message classes, but the module is useless: the generated Python message classes are missing all the functions provided by the Message base class, in particular all but one of the deserialization methods. The one method left (MergePartialFromCodedStream) won't run, because it requires an input stream of type CodedInputStream (which is part of the protobuf infrastructure and was therefore not wrapped by swig).

Does anyone have a working example of getting swig to work on top of protobuf-C++?

Alternatively, is there an example of some other solution, such as the Python extension mentioned on the same page? That seems like a high-maintenance solution for my dynamic schema, though.

If none of this works, I'm considering dropping Python in favor of Groovy, assuming that the Java implementation of protocol buffers would be more efficient. Any comments on that?

Many thanks!

django dude asked Jul 12 '11 20:07





2 Answers

Newer versions of Protobuf let Python code use the fast C++ implementation of Protobuf. Set the environment variable PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=cpp.
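
A minimal sketch of doing this from inside a Python program, assuming the protobuf package is installed and was built with its C++ extension. The variable has to be set before google.protobuf is first imported. The helper name active_protobuf_backend is hypothetical, and api_implementation is an internal protobuf module, so treat the check as a diagnostic aid rather than a stable API.

```python
import os

# Must be set before google.protobuf is imported anywhere in the process,
# because the backend is selected at import time.
os.environ["PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION"] = "cpp"


def active_protobuf_backend():
    """Return the name of the backend protobuf selected, or None.

    Hypothetical helper for illustration. It relies on the internal
    module google.protobuf.internal.api_implementation, which may
    change between protobuf releases.
    """
    try:
        from google.protobuf.internal import api_implementation
    except ImportError:
        # protobuf is not installed in this environment.
        return None
    return api_implementation.Type()


if __name__ == "__main__":
    backend = active_protobuf_backend()
    if backend is None:
        print("protobuf is not installed")
    else:
        print("protobuf backend in use:", backend)
```

If the reported backend is not "cpp", the installed protobuf build probably does not include the C++ extension, and the library is using a different implementation instead.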

Sandeep answered Oct 03 '22 13:10



Here is the correct link to the Greplin fast-python-pb solution, which I ended up using. It's very easy to use (at least on Linux), and in my case performance improved by roughly a factor of 100.

This software is still young and not 100% compatible with the Google implementation, at least with regard to empty values in optional fields - but the differences are pretty minor.

django dude answered Oct 03 '22 11:10
