Google protocol buffers huge in python

I started using the protocol buffer library, but noticed that it was using huge amounts of memory. pympler.asizeof shows that a single one of my objects is about 76k! Basically, it contains a few strings, some numbers, some enums, and some optional lists of the same. If I were writing the same thing as a C struct, I would expect it to be under a few hundred bytes, and indeed the ByteSize method returns 121 (the size of the serialized string).

Is that what you'd expect from the library? I had heard it was slow, but this is unusable and makes me more inclined to believe I'm misusing it.

Edit

Here is an example I constructed. This is a .proto file similar to, but simpler than, the one I've been using:

    package pb;

    message A {
        required double a       = 1;
    }

    message B {
        required double b       = 1;
    }

    message C {
        required double c       = 1;
        optional string s       = 2;
    }

    message D {
        required string d       = 1;
        optional string e       = 2;
        required A a            = 3;
        optional B b            = 4;
        repeated C c            = 5;
    }

And here is how I am using it:

>>> import pb_pb2
>>> a = pb_pb2.D()
>>> a.d = "a"
>>> a.e = "e"
>>> a.a.a = 1
>>> a.b.b = 2
>>> c = a.c.add()
>>> c.c = 5
>>> c.s = "s"
>>> import pympler.asizeof
>>> pympler.asizeof.asizeof(a)
21440
>>> a.ByteSize()
42

I have version 2.2.0 of protobuf (a bit old at this point), and Python 2.6.4.
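As a rough sanity check of the C-struct comparison, the same fields can be packed by hand with Python's struct module. The layout below (length-prefixed strings, little-endian doubles) is my own hypothetical flattening of message D, not anything protobuf itself produces:

```python
import struct

# Hypothetical flat encoding of message D: two length-prefixed strings,
# the doubles from the nested A and B messages, and one repeated C entry
# (a double plus a length-prefixed string). Lengths use an unsigned byte.
def pack_d(d, e, a, b, c_entries):
    out = struct.pack("B", len(d)) + d.encode()
    out += struct.pack("B", len(e)) + e.encode()
    out += struct.pack("<dd", a, b)  # A.a and B.b
    for c_val, s in c_entries:
        out += struct.pack("<d", c_val)
        out += struct.pack("B", len(s)) + s.encode()
    return out

packed = pack_d("a", "e", 1.0, 2.0, [(5.0, "s")])
print(len(packed))  # prints 30
```

So the data itself fits in a few dozen bytes, in line with both the C-struct intuition and the ByteSize() result; the 21440 figure is per-object Python overhead, not payload.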

Asked Aug 08 '11 by pythonic metaphor


1 Answer

Object instances have a much bigger memory footprint in Python than in compiled languages. For example, the following code, which creates very simple classes mimicking your proto, prints 1440:

from pympler.asizeof import asizeof

class A:
  def __init__(self):
    self.a = 0.0

class B:
  def __init__(self):
    self.b = 0.0

class C:
  def __init__(self):
    self.c = 0.0
    self.s = ""

class D:
  def __init__(self):
    self.d = ""
    self.e = ""
    self.e_isset = 1
    self.a = A()
    self.b = B()
    self.b_isset = 1
    self.c = [C()]

d = D()
print asizeof(d)

I am not surprised that protobuf's generated classes take 20 times more memory, as they add a lot of boilerplate.

The C++ version surely doesn't suffer from this.

Answered Sep 19 '22 by Jerome