In the process of looking for solutions to help sanitise some output, I came across code that does the following.
preg_replace('|[^a-z0-9-~+_.?#=!&;,/:%@$\|*\'()\\x80-\\xff]|i', '', $some_url)
Now, I think it's basically trying to remove anything other than the characters listed above. But doesn't \\x80-\\xff
refer to some form of non-printable ASCII characters? If so, why would the code be trying NOT to remove them?
Any pointers would be appreciated. Thanks.
\x80-\xFF is a non-ASCII character range. These bytes are still printable: they are characters in Latin-1, and in UTF-8 they are the bytes that encode higher code points.
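As a rough illustration of why the range is kept, here is a minimal sketch that runs the pattern from the question on a made-up URL. The input string and the variable names are assumptions for demonstration only. The two UTF-8 bytes of "é" fall inside \x80-\xff and survive, while the angle brackets and the space are stripped.

```php
<?php
// Same character class as in the question; bytes 0x80-0xFF are part of the allowed set.
$pattern = '|[^a-z0-9-~+_.?#=!&;,/:%@$\|*\'()\\x80-\\xff]|i';

// Hypothetical input: "é" written as its two UTF-8 bytes, plus characters the class rejects.
$some_url = "http://example.com/caf\xC3\xA9?q=<b>a b</b>";

// Everything outside the allowed set is deleted; the non-ASCII bytes are left alone.
$clean = preg_replace($pattern, '', $some_url);

echo $clean, "\n";
```

If the range were dropped from the class, every byte of a multibyte UTF-8 character would be removed and international URLs would be mangled.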
Using \\x80 over \x80 is slightly more correct. The backslash escapes itself in strings. In single-quoted strings too, although it's effectively irrelevant there.
In double-quoted strings, however, using just \x80 would be interpreted by PHP, whereas \\x80 would be seen and interpreted by the regex engine.
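The difference can be checked by looking at what PHP actually stores for each spelling. This is a small sketch with illustrative variable names, not code from the original question.

```php
<?php
// Single quotes: PHP only unescapes \\ and \', so both spellings
// reach the regex engine as the four characters: backslash, x, 8, 0.
$single_one = '\x80';
$single_two = '\\x80';

// Double quotes: PHP itself expands \x80 into one raw byte,
// while \\x80 is passed on as backslash, x, 8, 0.
$double_one = "\x80";
$double_two = "\\x80";

var_dump(strlen($single_one), strlen($single_two), strlen($double_one), strlen($double_two));

// Either way the class ends up matching byte 0x80.
$match_escape = preg_match('/[\\x80]/', "\x80"); // escape resolved by the regex engine
$match_raw    = preg_match("/[\x80]/", "\x80");  // raw byte already placed in the pattern by PHP
var_dump($match_escape, $match_raw);
```

Both patterns match here, which is why the doubled backslash is described as only slightly more correct: it decides who interprets the escape, not whether the byte is matched.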
Okay, all the answers given so far led me in the right direction and allowed me to find the following in the documentation.
After \x, up to two hexadecimal digits are read (letters can be in upper or lower case). In UTF-8 mode, \x{...} is allowed, where the contents of the braces is a string of hexadecimal digits. It is interpreted as a UTF-8 character whose code number is the given hexadecimal number. The original hexadecimal escape sequence, \xhh, matches a two-byte UTF-8 character if the value is greater than 127.
So, in summary:
i) '\x' starts a hexadecimal escape sequence, after which up to two hexadecimal digits are read
ii) '\xhh': the hexadecimal digits 'hh' can be in upper or lower case
iii) '\xhh' specifies a code point in the range 0-FF
iv) '\x80-\xFF' refers to a character range outside ASCII
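The points above can be tried out with a short sketch. The test character "é" (U+00E9) and the variable names are chosen only for illustration. Without the u modifier the subject is treated as two bytes; with it, the same subject is a single code point.

```php
<?php
// "é" (U+00E9) written as its two UTF-8 bytes.
$e_acute = "\xC3\xA9";

// Byte mode (no /u): the subject is two bytes, each in 0x80-0xFF.
$byte_any   = preg_match('/[\x80-\xff]/', $e_acute);      // finds a byte in the range
$byte_whole = preg_match('/^[\x80-\xff]$/', $e_acute);    // the subject is two bytes, not one
$byte_pair  = preg_match('/^[\x80-\xFF]{2}$/', $e_acute); // upper-case hex digits, two bytes

// UTF-8 mode (/u): the subject is one code point, U+00E9.
$utf_whole  = preg_match('/^[\x80-\xff]$/u', $e_acute);
$utf_braces = preg_match('/^\x{e9}$/u', $e_acute);        // braces form of the escape

var_dump($byte_any, $byte_whole, $byte_pair, $utf_whole, $utf_braces);
```

The pattern in the question has no u modifier, so its \x80-\xff range works on individual bytes.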
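You don't need to use a double backslash in a pattern with PHP. Even if you do use one, PHP collapses it to a single backslash in quoted strings, so the regex engine reads it as an ordinary escape.
One exception: if you enclose the pattern in nowdoc syntax, PHP does no escape processing, so the double backslash reaches the regex engine and is seen as a literal backslash. Heredoc, by contrast, behaves like a double-quoted string.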
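A quick way to check how the two syntaxes treat a doubled backslash is to compare the stored strings. This is a sketch with made-up variable names. It relies on the PHP manual's description that heredoc text behaves like a double-quoted string, while nowdoc text is not parsed at all.

```php
<?php
// Heredoc: escapes are processed, so \\ becomes a single backslash.
$heredoc = <<<EOT
[\\x80-\\xff]
EOT;

// Nowdoc: nothing is processed, both backslashes are kept.
$nowdoc = <<<'EOT'
[\\x80-\\xff]
EOT;

var_dump(strlen($heredoc), strlen($nowdoc));

// The heredoc pattern is the usual byte range and matches 0x80.
$heredoc_match = preg_match('/' . $heredoc . '/', "\x80");

// The nowdoc pattern is a different class: a literal backslash, some
// literal letters and digits, and the range from "0" to "\".
$nowdoc_match  = preg_match('/' . $nowdoc . '/', "\x80");
$nowdoc_letter = preg_match('/' . $nowdoc . '/', "A");

var_dump($heredoc_match, $nowdoc_match, $nowdoc_letter);
```

So in a nowdoc the pattern should be written with single backslashes, as \x80-\xff.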