
PHP unserialize fails with non-encoded characters?

$ser = 'a:2:{i:0;s:5:"héllö";i:1;s:5:"wörld";}'; // fails
$ser2 = 'a:2:{i:0;s:5:"hello";i:1;s:5:"world";}'; // works
$out = unserialize($ser);
$out2 = unserialize($ser2);
print_r($out);
print_r($out2);
echo "<hr>";

But why?
Should I encode before serializing, then? How?

I am using JavaScript to write the serialized string to a hidden field, then reading it in PHP via $_POST.
In JS I have something like:

function writeImgData() {
    var caption_arr = new Array();
    $('.album img').each(function(index) {
        caption_arr.push($(this).attr('alt'));
    });
    $("#hidden-field").attr("value", serializeArray(caption_arr));
};
FFish asked May 17 '10 22:05


1 Answer

The reason unserialize() fails with:

$ser = 'a:2:{i:0;s:5:"héllö";i:1;s:5:"wörld";}';

is that the declared lengths for héllö and wörld are wrong. The number after s: is the string's length in bytes, not in characters, and PHP's strlen() counts bytes. In UTF-8, é and ö each take two bytes:

echo strlen('héllö'); // 7
echo strlen('wörld'); // 6
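The same mismatch can be reproduced on the JavaScript side, which is probably where the wrong 5 came from. A minimal sketch, assuming UTF-8 is the encoding PHP receives and that TextEncoder is available (modern browsers and Node.js); utf8ByteLength is a hypothetical helper name, not something from the question:

```javascript
// Hypothetical helper: number of bytes the string occupies when encoded as UTF-8.
function utf8ByteLength(str) {
  return new TextEncoder().encode(str).length;
}

// Unicode escapes are used so the result does not depend on how this file is saved.
const word = 'h\u00e9ll\u00f6'; // "héllö"

console.log(word.length);          // 5: String.length counts UTF-16 code units
console.log(utf8ByteLength(word)); // 7: what PHP's strlen() reports for the UTF-8 bytes
```

A serializer that writes word.length into the s: field will therefore produce a string that unserialize() rejects whenever the text contains non-ASCII characters.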

However, if you unserialize() the following string, which declares the correct byte lengths:

$ser = 'a:2:{i:0;s:7:"héllö";i:1;s:6:"wörld";}';

echo '<pre>';
print_r(unserialize($ser));
echo '</pre>';

It works:

Array
(
    [0] => héllö
    [1] => wörld
)

If you use PHP's own serialize(), it computes the byte lengths of multi-byte strings correctly.
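If the string has to be built in JavaScript anyway, the serializer needs to apply the same byte-length rule. The following is only a sketch of how the question's serializeArray() might be written, not its actual code: phpSerializeStrings is a hypothetical name, it handles flat arrays of strings only, and it assumes the form is submitted as UTF-8.

```javascript
const encoder = new TextEncoder();

// Hypothetical stand-in for the question's serializeArray().
// Produces PHP's serialize() format for a flat, zero-indexed array of strings.
function phpSerializeStrings(items) {
  const parts = items.map((item, index) => {
    const text = String(item);
    // The s: field must hold the UTF-8 byte count, not text.length.
    const byteLength = encoder.encode(text).length;
    return `i:${index};s:${byteLength}:"${text}";`;
  });
  return `a:${items.length}:{${parts.join('')}}`;
}

const serialized = phpSerializeStrings(['h\u00e9ll\u00f6', 'w\u00f6rld']);
console.log(serialized); // a:2:{i:0;s:7:"héllö";i:1;s:6:"wörld";}
```

PHP's format does not escape quotes inside strings, because the declared length alone tells the parser where each string ends. That is why a wrong length breaks parsing.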

On the other hand, if you need to exchange serialized data between programming languages, drop PHP's serialization format and use something like JSON, which is far more standardized.
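As a rough illustration of the JSON route, the client side reduces to one built-in call with no length bookkeeping. The captions array below is a stand-in for the question's caption_arr, and the hidden field would receive payload instead of the hand-built string:

```javascript
// Stand-in for the captions collected from the img alt attributes.
const captions = ['h\u00e9ll\u00f6', 'w\u00f6rld'];

// JSON.stringify leaves non-ASCII characters as-is and needs no byte counts.
const payload = JSON.stringify(captions);
console.log(payload); // ["héllö","wörld"]

// Round trip on the JavaScript side, mirroring what the server would decode.
const roundTripped = JSON.parse(payload);
console.log(roundTripped[0] === captions[0]); // true
```

On the PHP side, the posted value can then be decoded with json_decode(), passing true as the second argument to get an array.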

Alix Axel answered Sep 21 '22 23:09
