
PHP: array of objects - serialize vs json_encode - alternatives?

In PHP I'm having a really hard time using serialize/unserialize on a large array of objects (100000+ objects). These objects can be of many different types, but all descend from a base class.

Somehow, when I use unserialize on the array of objects, about 0.001% of the objects come out wrong: a completely different object is generated instead. This does not happen randomly; it is the same objects each time. But if I change the order of the array, different objects are affected, so this looks like a bug to me.

I switched to json_encode/json_decode, but found that this always uses stdClass as the object's class. I worked around this by including the class name of each object as a property and then using that property to construct the new object, but this solution is not very elegant.
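The class-name workaround described above might look roughly like the following sketch. The classes Shape, Circle and Square, the `__class` key, and the helpers toTagged() and fromTagged() are hypothetical names chosen for illustration; the question does not show its real classes. It assumes PHP 7 or later and public properties only.

```php
<?php
// Hypothetical class hierarchy standing in for the question's base class and descendants.
class Shape
{
    public $name = '';

    // Export the object as an array tagged with its concrete class name.
    public function toTagged(): array
    {
        return ['__class' => static::class] + get_object_vars($this);
    }
}

class Circle extends Shape
{
    public $radius = 0;
}

class Square extends Shape
{
    public $side = 0;
}

// Rebuild an object from a decoded, tagged array.
function fromTagged(array $data): Shape
{
    $class = $data['__class'] ?? '';
    // Only instantiate Shape or its descendants, never an arbitrary class name.
    if (!is_a($class, Shape::class, true)) {
        throw new InvalidArgumentException("Unexpected class: $class");
    }
    $object = new $class();
    foreach ($data as $key => $value) {
        if ($key !== '__class') {
            $object->$key = $value;
        }
    }
    return $object;
}

$circle = new Circle();
$circle->name = 'c1';
$circle->radius = 2;

$square = new Square();
$square->name = 's1';
$square->side = 3;

// Encode the tagged arrays, then decode to associative arrays and rebuild.
$tagged = array_map(function (Shape $s) { return $s->toTagged(); }, [$circle, $square]);
$json = json_encode($tagged);
$restored = array_map('fromTagged', json_decode($json, true));
```

Checking the class name against the base class before instantiating keeps the decoder from creating arbitrary classes named in the JSON.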

Using var_export with eval works fine but is about 3 times slower than the other methods and uses much more memory.

Now my questions are :

  • what could cause the bug / wrong objects that are created with unserialize ?
  • is there a better way to use json_decode with an array of objects, so that classes are somehow stored within the json automatically?
  • is there maybe even another method to read/write a large array of objects in PHP?

UPDATE

I'm beginning to believe there must be something strange with my array-data, because with msgpack_serialize (php extension, alternative to serialize) I get the same kind of errors (but strangely enough not the same objects are generated wrong!).

UPDATE 2

Found a solution : instead of doing serialize on the entire array, I do it on each object now, first serialize and then base64_encode and then I store each serialized object as a separate line in a text-file. This way I can generate the entire array of objects and then iterate each object using file() with unserialize and base64_decode : no more errors!
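A minimal sketch of the approach from UPDATE 2, assuming a hypothetical Item class and hypothetical helper names writeObjects() and readObjects(); the temporary file path is also only for illustration.

```php
<?php
// Hypothetical class standing in for the question's object hierarchy.
class Item
{
    public $id;

    public function __construct($id = 0)
    {
        $this->id = $id;
    }
}

// Write one base64-encoded serialized object per line.
function writeObjects(string $path, array $objects): void
{
    $handle = fopen($path, 'wb');
    foreach ($objects as $object) {
        fwrite($handle, base64_encode(serialize($object)) . "\n");
    }
    fclose($handle);
}

// Read the file back, decoding and unserializing each line.
function readObjects(string $path): array
{
    $objects = [];
    foreach (file($path, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES) as $line) {
        $objects[] = unserialize(base64_decode($line));
    }
    return $objects;
}

$path = tempnam(sys_get_temp_dir(), 'objs');
writeObjects($path, [new Item(1), new Item(2), new Item(3)]);
$loaded = readObjects($path);
unlink($path);
```

The base64 step keeps every record on a single line, because raw serialize() output can contain newlines and NUL bytes. Note that objects serialized separately no longer share references with each other.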

Dylan asked Jul 05 '13 12:07



1 Answer

Two magic methods are tied to the serialize/unserialize functions.

__sleep()

serialize() checks if your class has a function with the magic name __sleep(). If so, that function is executed prior to any serialization. It can clean up the object and is supposed to return an array with the names of all variables of that object that should be serialized. If the method doesn't return anything then NULL is serialized and E_NOTICE is issued.

With __sleep() you have better control over serialization: you can choose which variables are serialized and clean up resources before serialization.

When unserialize() is called, the counterpart method is invoked:

__wakeup()

The intended use of __wakeup() is to reestablish any database connections that may have been lost during serialization and perform other reinitialization tasks.
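As a rough illustration of both methods, the sketch below uses a hypothetical Report class whose $handle property stands in for a database connection or similar resource; `php://memory` is used only so the example runs without external files.

```php
<?php
// Hypothetical class: $handle stands in for a connection or other resource.
class Report
{
    public $title;
    private $path;
    private $handle; // a resource, which cannot be serialized meaningfully

    public function __construct($title, $path)
    {
        $this->title = $title;
        $this->path = $path;
        $this->open();
    }

    private function open()
    {
        $this->handle = fopen($this->path, 'r+');
    }

    // Called by serialize(): return the names of the properties to keep.
    public function __sleep()
    {
        return ['title', 'path'];
    }

    // Called by unserialize(): re-establish what was left out.
    public function __wakeup()
    {
        $this->open();
    }

    public function isOpen()
    {
        return is_resource($this->handle);
    }
}

$report = new Report('Q3', 'php://memory');
$serialized = serialize($report);
$copy = unserialize($serialized);
```

Here the serialized string contains only title and path, and the restored copy reopens its handle in __wakeup().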

About json_encode()

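  1. It has no counterpart to the magic methods __wakeup and __sleep, so you have less control (since PHP 5.4 a class can implement JsonSerializable to shape the encoded output, but nothing comparable runs on decode)
  2. It only encodes public properties; private and protected properties are skipped
  3. json_decode() always returns objects as stdClass (or as arrays), never as the original class
  4. json_encode is often faster than serialize, though the difference depends on the data and the PHP version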
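Points 2 and 3 can be seen in a short sketch; the User class and its property values are hypothetical.

```php
<?php
// Hypothetical class with one property of each visibility.
class User
{
    public $name = 'alice';
    protected $role = 'admin';
    private $password = 'secret';
}

$user = new User();

// json_encode() only sees the public property.
$json = json_encode($user);

// json_decode() yields a stdClass (or an array), not a User.
$decoded = json_decode($json);
$asArray = json_decode($json, true);

// serialize()/unserialize() restores the original class.
$roundTrip = unserialize(serialize($user));

echo $json, "\n";                 // {"name":"alice"}
echo get_class($decoded), "\n";   // stdClass
echo get_class($roundTrip), "\n"; // User
```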

It is up to you which one you choose, but for more advanced classes (with a database connection, etc.) I would suggest serialize().

Robert answered Oct 13 '22 07:10
