
C equivalent to python pickle (object serialization)?

What would be the C equivalent of this Python code? Thanks.

import struct
import cPickle

data = gather_me_some_data()
# where data = [ (metric, datapoints), ... ]
# and datapoints = [ (timestamp, value), ... ]

serialized_data = cPickle.dumps(data, protocol=-1)
length_prefix = struct.pack("!L", len(serialized_data))
message = length_prefix + serialized_data
Bill asked Apr 04 '12 03:04



2 Answers

C has no built-in serialization mechanism, because type information is not available at run time. You have to embed some type information in the data yourself and then reconstruct the required object from it. So first define all the structs you may need:

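#include <stdlib.h>  /* malloc, free */
#include <string.h>  /* memcpy */

#define MY_DATA_SIZE 16  /* assumed value; pick whatever your data needs */
typedef struct {
  int myInt;
  float myFloat;
  unsigned char myData[MY_DATA_SIZE];
} MyStruct_1;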

typedef struct {
  unsigned char myUnsignedChar;
  double myDouble;
} MyStruct_2;

Then define an enum that lists every struct type you have:

typedef enum {
  ST_MYSTRUCT_1,
  ST_MYSTRUCT_2
} MyStructType;

Define a helper function that returns the size of any struct:

int GetStructSize(MyStructType structType) {
  switch (structType) {
    case ST_MYSTRUCT_1:
      return (int)sizeof(MyStruct_1);
    case ST_MYSTRUCT_2:
      return (int)sizeof(MyStruct_2);
    default:
      // unknown struct type
      return 0;
  }
}

Then define the serialization function:

// The caller must make sure serializedData has room for
// sizeof(MyStructType) + GetStructSize(structType) bytes.
void BinarySerialize(
    MyStructType structType,
    const void * structPointer,
    unsigned char * serializedData) {

  int structSize = GetStructSize(structType);

  if (structSize != 0) {
    // copy the type tag (struct metadata) to the start of the buffer
    memcpy(serializedData, &structType, sizeof(structType));
    // copy the struct itself right after the tag
    memcpy(serializedData+sizeof(structType), structPointer, structSize);
  }
}
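
BinarySerialize above trusts the caller to supply a large enough buffer. As a rough sketch only (SerializeChecked is a hypothetical name, not part of the code above, and it takes a plain int tag instead of MyStructType so that it compiles on its own), a variant that also receives the buffer capacity could look like this:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical bounds-checked variant (an illustration, not the answer's API).
   Copies a type tag followed by dataSize bytes of data into out, but only
   if both fit into outCap bytes.
   Returns the number of bytes written, or 0 if the buffer is too small. */
static size_t SerializeChecked(int typeTag, const void *data, size_t dataSize,
                               unsigned char *out, size_t outCap) {
    assert(data != NULL && out != NULL);

    /* check the tag first so the subtraction below cannot wrap around */
    if (outCap < sizeof(typeTag) || outCap - sizeof(typeTag) < dataSize) {
        return 0;
    }
    memcpy(out, &typeTag, sizeof(typeTag));
    memcpy(out + sizeof(typeTag), data, dataSize);
    return sizeof(typeTag) + dataSize;
}
```

The return value doubles as the message length, which is handy if you want to send the bytes over a socket afterwards.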

And the deserialization function:

void BinaryDeserialize(
    MyStructType structTypeDestination,
    void ** structPointer,
    const unsigned char * serializedData)
{
    // read the type tag stored in front of the struct
    MyStructType structTypeSource;
    memcpy(&structTypeSource, serializedData, sizeof(structTypeSource));

    // get source struct size
    int structSize = GetStructSize(structTypeSource);

    // allocate and copy only if the stored type matches the requested one;
    // otherwise *structPointer is left untouched
    if (structTypeSource == structTypeDestination && structSize != 0) {
        *structPointer = malloc(structSize);
        if (*structPointer != NULL) {
            memcpy(*structPointer, serializedData+sizeof(structTypeSource), structSize);
        }
    }
}

Serialization usage example:

#define SERIALIZED_DATA_MAX_SIZE 64  /* assumed value; must fit the tag plus the largest struct */

MyStruct_2 structInput = {0x69, 0.1};
MyStruct_1 * structOutput_1 = NULL;
MyStruct_2 * structOutput_2 = NULL;
unsigned char testSerializedData[SERIALIZED_DATA_MAX_SIZE] = {0};

// serialize structInput
BinarySerialize(ST_MYSTRUCT_2, &structInput, testSerializedData);
// try to de-serialize to something
// (the cast is needed: MyStruct_1 ** does not convert implicitly to void **)
BinaryDeserialize(ST_MYSTRUCT_1, (void **)&structOutput_1, testSerializedData);
BinaryDeserialize(ST_MYSTRUCT_2, (void **)&structOutput_2, testSerializedData);
// determine which object was de-serialized
// (plus you will get code-completion support about object members from IDE)
if (structOutput_1 != NULL) {
   // do something with structOutput_1
   free(structOutput_1);
}
else if (structOutput_2 != NULL) {
   // do something with structOutput_2
   free(structOutput_2);
}

I think this is the simplest serialization approach in C, but it has some problems:

  • The structs must not contain pointers, because the serializer cannot know how much memory a pointer refers to, nor where and how to restore the pointed-to data when deserializing.
  • This example depends on the system's endianness: the bytes are copied exactly as they are laid out in memory, so data written on a big-endian machine will be misread on a little-endian one unless you reverse the bytes where needed (for instance when reading an integral type such as the enum back out of the char buffer), or refactor the code to be more portable.
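
One way around the endianness problem is to write integers byte by byte in a fixed order instead of copying them with memcpy. The sketch below is only an illustration under my own assumptions: put_u32_be, get_u32_be and frame_message are hypothetical helper names, not part of the code above. It produces the same framing as struct.pack("!L", len(serialized_data)) + serialized_data in the question, i.e. a 4-byte big-endian length followed by the payload. The payload itself is whatever byte layout you choose; this does not produce a pickle stream.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: store a 32-bit value in big-endian (network) order,
   regardless of the byte order of the host. */
static void put_u32_be(unsigned char *out, uint32_t v) {
    out[0] = (unsigned char)(v >> 24);
    out[1] = (unsigned char)(v >> 16);
    out[2] = (unsigned char)(v >> 8);
    out[3] = (unsigned char)v;
}

/* Hypothetical helper: read back a value written by put_u32_be. */
static uint32_t get_u32_be(const unsigned char *in) {
    return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
}

/* Hypothetical framing helper: writes a 4-byte big-endian length prefix
   followed by payload_len bytes of payload into out.
   Returns the total number of bytes written, or 0 if out is too small. */
static size_t frame_message(unsigned char *out, size_t out_cap,
                            const unsigned char *payload, uint32_t payload_len) {
    assert(out != NULL && payload != NULL);

    if (out_cap < 4 || out_cap - 4 < payload_len) {
        return 0;
    }
    put_u32_be(out, payload_len);
    memcpy(out + 4, payload, payload_len);
    return 4 + (size_t)payload_len;
}
```

On the Python side the length of such a message can be read back with struct.unpack("!L", header)[0], where header is the first 4 bytes received.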
Agnius Vasiliauskas answered Sep 17 '22 15:09


If you can use C++, there is the PicklingTools library.

Janne Karila answered Sep 17 '22 15:09