I'm trying to mimic the json_encode
bitmask flags implemented in PHP 5.3.0. Here is the string I have:
$s = addslashes('O\'Rei"lly'); // O\'Rei\"lly
Doing json_encode($s, JSON_HEX_APOS | JSON_HEX_QUOT)
outputs the following:
"O\\\u0027Rei\\\u0022lly"
And I'm currently doing this in PHP versions older than 5.3.0:
str_replace(array('\\"', "\\'"), array('\\u0022', '\\\u0027'), json_encode($s))
or
str_replace(array('\\"', '\\\''), array('\\u0022', '\\\u0027'), json_encode($s))
Which correctly outputs the same result:
"O\\\u0027Rei\\\u0022lly"
I'm having trouble understanding why I need to replace single quotes ('\\\'' or even "\\'" [surrounding quotes excluded]) with '\\\u0027' and not just '\\u0027'.
Here is the code that I'm having trouble porting to PHP < 5.3:
if (get_magic_quotes_gpc() && version_compare(PHP_VERSION, '6.0.0', '<'))
{
    /* JSON_HEX_APOS and JSON_HEX_QUOT are available */
    if (version_compare(PHP_VERSION, '5.3.0', '>=') === true)
    {
        $_GET = json_encode($_GET, JSON_HEX_APOS | JSON_HEX_QUOT);
        $_POST = json_encode($_POST, JSON_HEX_APOS | JSON_HEX_QUOT);
        $_COOKIE = json_encode($_COOKIE, JSON_HEX_APOS | JSON_HEX_QUOT);
        $_REQUEST = json_encode($_REQUEST, JSON_HEX_APOS | JSON_HEX_QUOT);
    }
    /* mimic the behaviour of JSON_HEX_APOS and JSON_HEX_QUOT */
    else if (extension_loaded('json') === true)
    {
        $_GET = str_replace(array(), array('\\u0022', '\\u0027'), json_encode($_GET));
        $_POST = str_replace(array(), array('\\u0022', '\\u0027'), json_encode($_POST));
        $_COOKIE = str_replace(array(), array('\\u0022', '\\u0027'), json_encode($_COOKIE));
        $_REQUEST = str_replace(array(), array('\\u0022', '\\u0027'), json_encode($_REQUEST));
    }
    $_GET = json_decode(stripslashes($_GET));
    $_POST = json_decode(stripslashes($_POST));
    $_COOKIE = json_decode(stripslashes($_COOKIE));
    $_REQUEST = json_decode(stripslashes($_REQUEST));
}
The PHP string
'O\'Rei"lly'
is just PHP's way of getting the literal value
O'Rei"lly
into a string which can be used. Calling addslashes
on that string changes it to be literally the following 11 characters
O\'Rei\"lly
i.e. strlen(addslashes('O\'Rei"lly')) == 11
This is the value which is being sent to json_encode.
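As a quick check of the character counts above, a minimal sketch (the variable names are only for illustration):

```php
<?php
$raw = 'O\'Rei"lly';          // literal value: O'Rei"lly
$slashed = addslashes($raw);  // literal value: O\'Rei\"lly

// addslashes() puts one backslash in front of each quote, so the
// length grows from 9 to 11 characters.
echo $raw, ' (', strlen($raw), ")\n";         // O'Rei"lly (9)
echo $slashed, ' (', strlen($slashed), ")\n"; // O\'Rei\"lly (11)
```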
In JSON backslash is an escape character, so that needs to be escaped, i.e.
\
to be \\
Single and double quotes can also cause problems, so converting them to their Unicode escapes is one way to avoid those problems. With the JSON_HEX_APOS and JSON_HEX_QUOT flags, later versions of PHP's json_encode change
'
to be \u0027
and
"
to be \u0022
So applying these three rules to
O\'Rei\"lly
gives us
O\\\u0027Rei\\\u0022lly
This string is then wrapped in double quotes to make it a JSON string. Your replace expressions include the leading backslashes. Whether by accident or on purpose, this means that the leading and trailing double quotes returned by json_encode
are not subject to the escaping, which is as it should be.
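The three rules above, plus the surrounding double quotes, can be sketched by hand as follows. This only illustrates the rules for this one string; it is not a replacement for json_encode, and the variable names are made up for the example.

```php
<?php
$value = addslashes('O\'Rei"lly'); // literal value: O\'Rei\"lly

// strtr() with an array scans the input once, so a backslash produced
// by one rule is never picked up again by another rule.
$rules = array(
    '\\' => '\\\\',    // \ becomes \\
    '\'' => '\\u0027', // ' becomes \u0027
    '"'  => '\\u0022', // " becomes \u0022
);

// Wrap the escaped text in double quotes to make it a JSON string.
$json = '"' . strtr($value, $rules) . '"';

echo $json, "\n"; // "O\\\u0027Rei\\\u0022lly"
```

On PHP 5.3.0 or later the result can be compared with json_encode($value, JSON_HEX_APOS | JSON_HEX_QUOT).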
So in earlier versions of PHP
$s = addslashes('O\'Rei"lly');
print json_encode($s);
would print
"O\\'Rei\\\"lly"
We want to change ' to \u0027 and \" to \u0022. The \ in \" is only there to get the " into the string, because the JSON string itself begins and ends with double quotes.
So that's why we get
"O\\\u0027Rei\\\u0022lly"
It's escaping the backslash as well as the quote. It's difficult dealing with escaped escapes, as you're doing here, as it quickly turns into backslash counting games. :-/
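One way to read the '\\\u0027' puzzle: in the json_encode output the sequence \\' is an escaped backslash followed by a bare quote, so a search for \' also swallows the second backslash and the replacement has to put it back. The sketch below avoids the backslash counting by replacing only the bare ' and the escaped \". It is a sketch under assumptions, not a drop-in equivalent of the 5.3 flags: both function names are hypothetical, and it walks the output one escape pair at a time so that a value ending in a backslash is not mistaken for \".

```php
<?php
// Hypothetical helper, not part of PHP: rewrites a single regex match.
function hex_quote_callback($m)
{
    if ($m[0] === "'") {
        return '\\u0027'; // bare apostrophe
    }
    if ($m[0] === '\\"') {
        return '\\u0022'; // escaped double quote
    }
    return $m[0];         // any other escape pair (\\, \n, \u...) is kept
}

// Hypothetical wrapper approximating JSON_HEX_APOS | JSON_HEX_QUOT.
function json_encode_hex_quotes($value)
{
    // Match either a backslash plus the character it escapes, or a bare '.
    // Consuming escape pairs as units keeps \\" (escaped backslash, then a
    // closing quote) from being misread as \".
    return preg_replace_callback('/\\\\.|\'/s', 'hex_quote_callback', json_encode($value));
}

echo json_encode_hex_quotes(addslashes('O\'Rei"lly')), "\n";
// "O\\\u0027Rei\\\u0022lly"
```

The callback is a named function rather than a closure because closures only became available in PHP 5.3, which is the version this is meant to stand in for.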