
json_encode serialize null bytes

Tags: json, php

I ran into this serialize gotcha today. From the PHP.net docs:

Note: Object's private members have the class name prepended to the member name; protected members have a '*' prepended to the member name. These prepended values have null bytes on either side.
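
To make the prefixes from that note visible, the raw null bytes can be swapped for a printable marker. This is only a minimal sketch; the class `Demo` and its property names are made up for illustration:

```php
<?php
// Hypothetical class, used only to illustrate the quoted note.
class Demo {
    public $pub = 1;
    protected $prot = 2;
    private $priv = 3;
}

$serialized = serialize(new Demo());

// Replace each raw null byte with the two visible characters \0
// so the prefixes can be read on a terminal.
$visible = str_replace("\0", '\0', $serialized);
echo $visible, "\n";
```

In the printed string the protected member should appear as `\0*\0prot` and the private one as `\0Demo\0priv`.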

I'm using debug_backtrace to generate a trace for a debug report, which gets json_encoded. Internally it uses the serializer to generate the data for the trace.

This is the (partial) output of the json_encode:

{"\u0000MyObject\u0000my_var":[]}

The problem is that json_decode can't handle this: it complains about the null bytes.

So json_encode happily writes null bytes that json_decode can't decode. This seems a little wonky to me. I would expect json_encode to take care of the necessary escaping, or at least json_decode to be able to parse anything produced by json_encode, but this doesn't seem to be the case.

I guess I have several solutions:

  • Strip the null bytes from the trace. I'm not really interested in unserializing the object; I just want a string representation.
  • Strip private variables altogether from the trace.
  • Fix json_encode so that it doesn't produce null bytes.
  • Fix json_decode so that it accepts null bytes.
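
The first option could be sketched roughly as follows. The helper name `strip_null_bytes` is hypothetical, and the sketch only walks arrays and strings; objects are left untouched to avoid recursing through the object references a real backtrace contains:

```php
<?php
// Hypothetical helper: recursively removes null bytes from array keys
// and string values. Objects and other types are returned unchanged.
function strip_null_bytes($value) {
    if (is_string($value)) {
        return str_replace("\0", '', $value);
    }
    if (is_array($value)) {
        $clean = array();
        foreach ($value as $key => $item) {
            if (is_string($key)) {
                $key = str_replace("\0", '', $key);
            }
            $clean[$key] = strip_null_bytes($item);
        }
        return $clean;
    }
    return $value;
}

// Stand-in for the problematic part of the trace.
$trace = array("\0MyObject\0my_var" => array());

$json = json_encode(strip_null_bytes($trace));
echo $json, "\n"; // {"MyObjectmy_var":[]}

$decoded = json_decode($json);
```

Note that stripping can make two different keys collide (a private `my_var` on two classes stays distinct, but unusual names might not), so this only suits the "string representation for a report" use case.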

Did anyone run into this problem and how did you fix it?


Sample:

<?php
class MyClass {
    public $mypublic = 1;
    private $myprivate = 2;
    public function myfunc() {
        return debug_backtrace();
    }
}
$c = new MyClass();
$json = json_encode(call_user_func_array(array($c, "myfunc"), new MyClass())); // second argument is an object, not an array
echo $json;
echo json_decode($json); // <-- Fatal error: Cannot access property started with '\0' in test.php on line 12

Solution

Since PHP 5.3, call_user_func_array emits a warning when its second parameter is not an array. On earlier versions you have to check it yourself.
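
A rough sketch of why the non-array argument matters, and of the manual check. It assumes that older PHP versions silently converted the object to an array, which is my reading of the behaviour described here and is not confirmed by the source. The wrapper name `call_with_args` is hypothetical:

```php
<?php
class MyClass {
    public $mypublic = 1;
    private $myprivate = 2;
}

// Casting an object to an array exposes private members under
// "\0ClassName\0name" keys, and json_encode writes them as \u0000.
$json = json_encode((array) new MyClass());
echo $json, "\n"; // {"mypublic":1,"\u0000MyClass\u0000myprivate":2}

// Hypothetical guard: refuse anything that is not an array
// before handing it to call_user_func_array.
function call_with_args($callback, $args) {
    if (!is_array($args)) {
        throw new InvalidArgumentException('Arguments must be passed as an array');
    }
    return call_user_func_array($callback, $args);
}

echo call_with_args('strtoupper', array('ok')), "\n"; // OK
```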

Halcyon asked Jul 18 '11 09:07


1 Answer

(sorry, this might be better off as a comment, as it doesn't exactly answer your question -- but it's a bit too long for that)


I've tried reproducing what you describe, with both PHP 5.3 and 5.2, and here's what I get :

First, let's create a class with a private property and instantiate it:

class A {
    public $pub = 10;
    private $priv = 20;
}

$a = new A();
var_dump($a);

Which gets me :

object(A)[1]
  public 'pub' => int 10
  private 'priv' => int 20


Now, if I serialize() my object :

$serialized = serialize($a);
var_dump($serialized);

I get (null bytes shown here as \0):

string 'O:1:"A":2:{s:3:"pub";i:10;s:7:"\0A\0priv";i:20;}' (length=46)

Which is pretty much what you describe: there are those null bytes around the private property's name.


And let's continue with json_encode() :

$jsoned = json_encode($serialized);
var_dump($jsoned);

Which gives me, like you said, a string with some \u0000 :

string '"O:1:\"A\":2:{s:3:\"pub\";i:10;s:7:\"\u0000A\u0000priv\";i:20;}"' (length=64)


Now, if I try to json_decode() this string :

$unjsoned = json_decode($jsoned);
var_dump($unjsoned);

Here's what I get (null bytes shown as \0):

string 'O:1:"A":2:{s:3:"pub";i:10;s:7:"\0A\0priv";i:20;}' (length=46)

=> The null-bytes don't seem to be lost : they are properly re-created from the JSON string.


And, calling unserialize() on that :

$unserialized = unserialize($unjsoned);
var_dump($unserialized);

I get back the initial object I had :

object(A)[2]
  public 'pub' => int 10
  private 'priv' => int 20

So, I can't seem to reproduce your problem when serializing+encoding and then decoding+unserializing...
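
For reference, the whole round trip from the steps above can be condensed into one script (same class `A` as above; only the comparisons at the end are added):

```php
<?php
class A {
    public $pub = 10;
    private $priv = 20;
}

$a = new A();

// serialize() produces a plain string containing raw null bytes.
$serialized = serialize($a);

// Encoding a *string* escapes the null bytes as \u0000 ...
$jsoned = json_encode($serialized);

// ... and decoding it restores them, since they sit inside a string
// value rather than in an object key.
$unjsoned = json_decode($jsoned);
var_dump($unjsoned === $serialized); // bool(true)

$unserialized = unserialize($unjsoned);
var_dump($unserialized == $a); // bool(true)
```

The difference from the question is that here the null bytes end up in a JSON string value, whereas in the question they end up in a JSON object key.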

I should add that I have not been able to find anything about such a bug in either:

  • PHP's bug tracker,
  • or the SVN history of the json extension.



Now, if I try with a more complex object, with a class that contains a private member, which is itself an object which contains a private property :

class A {
    private $priv;
    public function __construct() {
        $this->priv = new B();
    }
}

class B {
    private $b = 10;
}


I get exactly the same kind of behavior : everything works just fine -- and here is the output I get, when using exactly the same actions and var_dump() calls as before :

object(A)[1]
  private 'priv' => 
    object(B)[2]
      private 'b' => int 10

string 'O:1:"A":1:{s:7:"\0A\0priv";O:1:"B":1:{s:4:"\0B\0b";i:10;}}' (length=54)

string '"O:1:\"A\":1:{s:7:\"\u0000A\u0000priv\";O:1:\"B\":1:{s:4:\"\u0000B\u0000b\";i:10;}}"' (length=84)

string 'O:1:"A":1:{s:7:"\0A\0priv";O:1:"B":1:{s:4:"\0B\0b";i:10;}}' (length=54)

object(A)[3]
  private 'priv' => 
    object(B)[4]
      private 'b' => int 10

Here too, I cannot reproduce the problem you describe.



Still, if I try this :

var_dump( 
    unserialize( 
        json_decode('{"\u0000MyObject\u0000my_var":[]}')
    )
);

I do indeed run into trouble:

Fatal error: Cannot access property started with '\0'
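
The failure is specific to decoding into an object. A small sketch of the two decoding modes, assuming a recent PHP version (7.x or 8.x, as far as I can tell), where the object case reports a JSON error instead of the fatal error shown above:

```php
<?php
$json = '{"\u0000MyObject\u0000my_var":[]}';

// Array keys may contain null bytes, so decoding to an array works.
$asArray = json_decode($json, true);
$keys = array_keys($asArray);
echo str_replace("\0", '\0', $keys[0]), "\n"; // \0MyObject\0my_var

// Decoding to an object would need a property whose name starts
// with a null byte, which PHP does not allow.
$asObject = json_decode($json);
var_dump($asObject);              // NULL on recent PHP versions
echo json_last_error_msg(), "\n";
```

So if the decoded data is only needed for display, passing `true` as the second argument of json_decode sidesteps the problem without touching the encoded string.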

But, thinking about it, and decoding that string by hand, I don't really see how you'd have ended up with such a JSON string in the first place...

Are you sure the problem isn't somewhere else, such as in the encoding process?

Pascal MARTIN answered Oct 30 '22 13:10