For some reason, after submitting a string like this: Jack’s Spindle
from a text form to PHP, I get:
Jack%u2019s Spindle
This is not what PHP's urlencode()
would produce, which would be Jack%92s+Spindle
nor what rawurlencode()
would produce, which would be Jack%92s%20Spindle
Thus, urldecode()
and rawurldecode() don't decode that string. Is there another function for such strings?
--
Also, Jack&#8217;s Spindle
would be the HTML-safe way to encode the above, but urlencode()
and rawurlencode() for that yield: Jack%26%238217%3Bs+Spindle
and Jack%26%238217%3Bs%20Spindle
respectively.
Where is the %u2019
coming from? What does it represent? How do you get it back to just that innocuous apostrophe?
Well, only you can tell us where that came from. Where are you getting your text from, and which transformations is it going through before it reaches PHP? I confess I haven't seen that encoding strategy yet.
That said, it's very similar to the way JavaScript escapes UTF-16 code units: \uXXXX
where each X
represents a hexadecimal digit. To convert it to HTML entities, you could do:
preg_replace('/%u([a-fA-F0-9]{4})/', '&#x\\1;', $string)
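If you want the actual character back rather than an HTML entity, something along these lines might work. This is only a sketch: the function name decode_percent_u is made up for illustration, it assumes the mbstring extension is loaded and that you want UTF-8 output, and it treats each %uXXXX as a UTF-16 code unit (which is what JavaScript's legacy escape() function emits, so that may be where your string is coming from). Consecutive sequences are converted together so that surrogate pairs survive; ordinary %XX escapes are left untouched.

```php
<?php
// Hypothetical helper (not part of PHP): turns %uXXXX sequences into UTF-8.
// Assumes the mbstring extension is available.
function decode_percent_u(string $input): string
{
    $decoded = preg_replace_callback(
        // Match a run of one or more %uXXXX sequences so that
        // surrogate pairs (two adjacent code units) stay together.
        '/(?:%u[0-9a-fA-F]{4})+/',
        function (array $m): string {
            // Drop the "%u" markers, leaving only the hex digits,
            // then turn them into raw big-endian UTF-16 bytes.
            $utf16be = hex2bin(str_replace('%u', '', $m[0]));
            return mb_convert_encoding($utf16be, 'UTF-8', 'UTF-16BE');
        },
        $input
    );

    // preg_replace_callback() returns null on a regex error;
    // fall back to the unmodified input in that case.
    return $decoded ?? $input;
}
```

Calling decode_percent_u('Jack%u2019s Spindle') should give back Jack’s Spindle as a UTF-8 string. If your page is not served as UTF-8, change the target encoding in mb_convert_encoding() accordingly.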