
serialize a large array in PHP?

I am curious: is there a size limit on serialize() in PHP? Would it be possible to serialize an array with 5,000 keys and values so it can be stored in a cache?

I am hoping to cache a user's friend list on a social network site. The cache will need to be updated fairly often, but it will need to be read on almost every page load.

On a single-server setup, I am assuming APC would be better than memcache for this.

JasonDavis asked Aug 10 '09 20:08




1 Answer

Since several other people have answered already, just for fun, here's a very quick benchmark (do I dare call it that?). Consider the following code:

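<?php
// Size multiplier: each element is '1234567890' repeated $num times.
$num = 1;

// 5000 identical string values, with integer keys 0..4999.
$list = array_fill(0, 5000, str_repeat('1234567890', $num));

// Time 10000 serialize() calls on the same array.
$before = microtime(true);
for ($i = 0; $i < 10000; $i++) {
    $str = serialize($list);
}
$after = microtime(true);

var_dump($after - $before);        // elapsed time, in seconds
var_dump(memory_get_peak_usage()); // peak memory, in bytes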

I'm running this on PHP 5.2.6 (the one bundled with Ubuntu jaunty).
And, yes, there are only values, with no explicit keys (just the integer keys that array_fill() generates), and the values are quite simple: no objects, no sub-arrays, nothing but strings.
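
To get a feel for what is actually being produced, here is a minimal sketch that builds the same 5000-element list, checks the length of the serialized string, and round-trips it through unserialize(). The variable names $serialized and $restored are mine and are not part of the benchmark.

```php
<?php
// Same shape of data as the benchmark: 5000 short strings, integer keys.
$list = array_fill(0, 5000, str_repeat('1234567890', 1));

$serialized = serialize($list);
$restored = unserialize($serialized);

// Length of the serialized string in bytes; it grows with both the
// number of elements and the size of each element.
var_dump(strlen($serialized));

// The round trip should give back an identical array.
var_dump($restored === $list);
```

This only shows that an array of this size serializes and restores cleanly; it says nothing about how fast that is, which is what the timings measure.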

For $num = 1, you get :

float(11.8147978783)
int(1702688)

For $num = 10, you get :

float(13.1230671406)
int(2612104)

And, for $num = 100, you get :

float(63.2925770283)
int(11621760)

So, it seems the bigger each element of the array is, the longer it takes (which seems fair, actually). But with elements 100 times bigger, it doesn't take 100 times longer...
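
To make that comparison concrete, here is a small sketch that turns the three totals measured above into a per-call cost and a ratio against the $num = 1 run. The $totals array simply restates the measured figures; the variable names are mine.

```php
<?php
// Totals measured above for 10000 serialize() calls on 5000 elements,
// keyed by the value of $num.
$totals = array(
    1   => 11.8147978783,
    10  => 13.1230671406,
    100 => 63.2925770283,
);
$iterations = 10000;

foreach ($totals as $num => $seconds) {
    // Average cost of one serialize() call, in milliseconds.
    $perCallMs = $seconds / $iterations * 1000;
    // Slowdown relative to the $num = 1 run.
    $ratio = $seconds / $totals[1];
    printf("num=%d: %.2f ms per call, %.2fx the num=1 time\n", $num, $perCallMs, $ratio);
}
```

With these figures, one call costs roughly 1.2 ms for $num = 1 and roughly 6.3 ms for $num = 100, so elements 100 times bigger take a bit more than 5 times longer.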


Now, with an array of 50000 elements, instead of 5000, which means this part of the code is changed :

$list = array_fill(0, 50000, str_repeat('1234567890', $num));

With $num = 1, you get :

float(158.236332178)
int(15750752)

Considering the time it took for $num = 1, I won't be running this for either $num = 10 or $num = 100...


Yes, of course, in a real situation, you wouldn't be doing this 10000 times; so let's try with only 10 iterations of the for loop (still with the 50000-element array).

For $num = 1 :

float(0.206310987473)
int(15750752)

For $num = 10 :

float(0.272629022598)
int(24849832)

And for $num = 100 :

float(0.895547151566)
int(114949792)

Yeah, that's almost 1 second -- and quite a bit of memory used (roughly 115 MB at peak) ^^
(No, this is not a production server: I have a pretty high memory_limit on this development machine ^^ )
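
The three knobs varied above (number of elements, size of each element, number of iterations) can be wrapped into one function so the runs are easier to repeat. This is only a sketch: benchmark_serialize() is a hypothetical helper name of mine, not something from PHP itself.

```php
<?php
/**
 * Hypothetical helper: times $iterations serialize() calls on an array
 * of $elements strings, each string being '1234567890' repeated $num times.
 * Returns array(elapsed seconds, peak memory in bytes).
 */
function benchmark_serialize($elements, $num, $iterations)
{
    $list = array_fill(0, $elements, str_repeat('1234567890', $num));

    $before = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $str = serialize($list);
    }
    $after = microtime(true);

    return array($after - $before, memory_get_peak_usage());
}

// Small values so the sketch finishes quickly; raise them to match the runs above.
list($elapsed, $peak) = benchmark_serialize(5000, 1, 10);
printf("%.4f s, %d bytes peak\n", $elapsed, $peak);
```

Note that memory_get_peak_usage() reports the peak for the whole script, so to compare memory between configurations you would run each one in a separate process.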


So, in the end, to put it more briefly than all those numbers (and, yes, you can make numbers say whatever you want them to): I wouldn't say there is a "limit" hardcoded in PHP, but you'll end up running into one of these:

  • max_execution_time (on a web server, it's generally no more than 30 seconds, which is the default)
  • memory_limit (on a web server, it's generally not much more than 32MB)
  • the load on your web server: while one of those big serialize loops was running, it took up one of my CPUs; if you have quite a few users on the same page at the same time, I'll let you imagine the result ;-)
  • the patience of your users ^^
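
To see where you stand against the first two limits, a minimal sketch like the following reads the current settings and measures a single serialize() call. The sample array is only a stand-in for your real data.

```php
<?php
// Current limits, as raw php.ini strings (for example "30" and "128M").
// A max_execution_time of 0 or a memory_limit of -1 means "no limit".
$maxTime = ini_get('max_execution_time');
$memLimit = ini_get('memory_limit');

// Stand-in data; replace with the array you actually want to cache.
$list = array_fill(0, 5000, '1234567890');

$before = microtime(true);
$str = serialize($list);
$elapsed = microtime(true) - $before;

printf("max_execution_time: %s s, memory_limit: %s\n", $maxTime, $memLimit);
printf(
    "one serialize(): %.6f s, %d bytes of output, %d bytes peak memory\n",
    $elapsed,
    strlen($str),
    memory_get_peak_usage()
);
```

Keep in mind that the command-line PHP binary usually runs with different settings than the web server, so run this through the same setup that will serve your pages.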

But unless you are really serializing long arrays of big data, I am not sure it will matter that much...
And you must also take into consideration how much time and CPU load using that cache might save you ;-)
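
As an illustration of that trade-off, here is a rough sketch of a read-through cache for a friend list. Everything in it is an assumption for the sake of the example: it uses one file per user in the temp directory as a stand-in for APC or memcache, and load_friends_from_db() is a hypothetical placeholder for the expensive query you are trying to avoid.

```php
<?php
// Hypothetical stand-in for APC/memcache: one file per user in the temp dir.
function cache_path($userId)
{
    return sys_get_temp_dir() . '/friends_' . (int) $userId . '.cache';
}

// Hypothetical loader; on a real site this would be the expensive database query.
function load_friends_from_db($userId)
{
    return array_fill(0, 5000, 'friend');
}

function get_friends($userId)
{
    $path = cache_path($userId);
    if (is_file($path)) {
        $friends = unserialize(file_get_contents($path));
        if (is_array($friends)) {
            return $friends; // cache hit
        }
    }
    // Cache miss (or unreadable entry): rebuild and store.
    $friends = load_friends_from_db($userId);
    file_put_contents($path, serialize($friends), LOCK_EX);
    return $friends;
}

// Call this whenever the user's friend list changes.
function invalidate_friends($userId)
{
    $path = cache_path($userId);
    if (is_file($path)) {
        unlink($path);
    }
}

$friends = get_friends(42);
var_dump(count($friends));
```

The cache pays off when reads (every page load) far outnumber invalidations; if the list changes almost as often as it is read, the serialize and store cost on each miss can eat up the gain.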

Still, the best way to know would be to test it yourself, with real data ;-)


And you might also want to take a look at what Xdebug can do when it comes to profiling : this kind of situation is one of those it is useful for!

Pascal MARTIN answered Oct 20 '22 05:10