I have user submitted tags that can be any type of (valid) UTF-8 string. I want to know if it is safe to include them in the URL merly by running them through urlencode()
.
In other words, is urlencode() safe to use for valid UTF-8 strings? (by valid I mean id have already force-encoded them to UTF-8)
urlencode
does not depend on a specific character encoding. It just looks at the bytes, interprets them as ASCII characters and replaces any byte that is either not allowed in ASCII (0x80–0xFF) or not allowed in plain in a URL.
Now to your question: Yes, using urlencode
does encode any string in any character encoding to be safely used – but only in the URL query! Because urlencode
formats the input according to application/x-www-form-urlencoded that differs from the “normal” percent encoding in how the space is encoded: In application/x-www-form-urlencoded spaces are replaced by +
while the “normal” percent encoding replaces them by %20
.
If you want to “normal” percent encoding use rawurlencode
instead.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With