I currently interface with a server that provides protocol buffers, and I can potentially receive a very large number of messages. My current process for reading the protocol buffers and converting them to a Pandas DataFrame (not a necessary step in general, but Pandas offers nice tools for analyzing datasets) is:
1. Read the protobuf messages
2. Convert each message to a dictionary
3. Use pandas.DataFrame.from_records to get a DataFrame
This works great, but given the large number of messages I read from the protobuf, it is quite inefficient to convert to a dictionary and then to pandas. My question is: is it possible to write a class that makes a Python protobuf object look like a dictionary? That is, remove step 2. Any references or pseudocode would be helpful.
With protocol buffers, you write a .proto description of the data structure you wish to store. From that, the protocol buffer compiler creates a class that implements automatic encoding and parsing of the protocol buffer data with an efficient binary format.
Protocol Buffers (Protobuf) is a free and open-source cross-platform data format used to serialize structured data. It is useful in developing programs to communicate with each other over a network or for storing data.
We can convert a dictionary to a pandas DataFrame by using the pd.DataFrame.from_dict() class method.
You might want to check out the ProtoText Python package. It provides in-place, dict-like operations for accessing your protobuf object.
Example usage:
Assume you have a Python protobuf object person_obj.
import ProtoText
print(person_obj['name'])  # print person_obj.name
person_obj['name'] = 'David'  # set the attribute 'name' to 'David'
# set the attribute 'name' to 'David' again, this time in batch mode
person_obj.update({'name': 'David'})
print('name' in person_obj)  # print whether the 'name' attribute is set in person_obj
# the 'in' operator is more forgiving than Google's HasField method,
# in the sense that it won't raise an exception even if the field is not defined
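If pulling in an extra package is not an option, the "make a protobuf object look like a dictionary" idea from the question can also be sketched by hand. The class below is only an illustration, not part of ProtoText or of the protobuf library: MessageView is a hypothetical name, and the sketch assumes the wrapped object exposes DESCRIPTOR.fields (as generated Python message classes do) and that each field can be read with getattr. It is read-only and returns nested messages and repeated fields as they are.

```python
from collections.abc import Mapping


class MessageView(Mapping):
    """Read-only, dict-like view over a protobuf message (hypothetical helper)."""

    def __init__(self, message):
        self._message = message
        # Assumption: DESCRIPTOR.fields lists the fields declared in the .proto file.
        self._names = tuple(field.name for field in message.DESCRIPTOR.fields)

    def __getitem__(self, key):
        # Only declared fields count as keys; anything else is a KeyError,
        # which also makes the 'in' operator safe for undefined fields.
        if key not in self._names:
            raise KeyError(key)
        return getattr(self._message, key)

    def __iter__(self):
        return iter(self._names)

    def __len__(self):
        return len(self._names)
```

With this in place, something like rows = [MessageView(m) for m in messages] gives a list of objects that support row['name'], 'name' in row, and dict(row) without building a dictionary up front. Whether this actually saves time with pandas.DataFrame.from_records is worth measuring, since pandas may still copy each mapping into a plain dict internally.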