I'm using URI.encode to generate HTML data URLs:
visit "data:text/html,#{URI::encode(html)}"
After upgrading to Ruby 2.7.1, the interpreter started warning:
warning: URI.escape is obsolete
The recommended replacements are CGI.escape and URI.encode_www_form_component. However, they don't do the same thing:
2.7.1 :007 > URI.escape '<html>this and that</html>'
(irb):7: warning: URI.escape is obsolete
=> "%3Chtml%3Ethis%20and%20that%3C/html%3E"
2.7.1 :008 > CGI.escape '<html>this and that</html>'
=> "%3Chtml%3Ethis+and+that%3C%2Fhtml%3E"
2.7.1 :009 > URI.encode_www_form_component '<html>this and that</html>'
=> "%3Chtml%3Ethis+and+that%3C%2Fhtml%3E"
The result of these slight encoding differences is an HTML page where spaces are replaced by +. My question is: what's a good replacement for URI.encode for this use case?
URI::escape is good for escaping a URL that was not escaped properly. For example, some websites output wrong or unescaped URLs in their anchor tags. If your program uses these URLs to fetch more resources, OpenURI will complain that the URLs are invalid.
If you need to encode query strings, then the CGI.escape method is probably what you're looking for. CGI.escape follows the CGI/HTML forms spec and returns a string in the application/x-www-form-urlencoded format, which requires spaces to be escaped to + and most special characters to be percent-encoded.
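As a rough illustration of the query-string case, here is a minimal sketch. The helper name form_query and the sample parameters are made up for this example; CGI.escape and URI.encode_www_form are the standard library calls.

```ruby
require 'cgi'
require 'uri'

# Hypothetical helper (not part of any library): builds a form-encoded
# query string by escaping each key and value with CGI.escape.
def form_query(params)
  params.map { |key, value| "#{CGI.escape(key.to_s)}=#{CGI.escape(value.to_s)}" }.join('&')
end

params = { 'q' => 'this and that', 'tag' => '<html>' }

puts form_query(params)           # q=this+and+that&tag=%3Chtml%3E
puts URI.encode_www_form(params)  # standard library equivalent, same output here
```

Spaces become + in both cases, which is right for a query string but not for the body of a data URL.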
There is actually a drop-in replacement.
require 'uri'
s = '<html>this and that</html>'
parser = URI::Parser.new  # named parser rather than p, to avoid shadowing Kernel#p
parser.escape(s)
=> "%3Chtml%3Ethis%20and%20that%3C/html%3E"
Docs: https://docs.w3cub.com/ruby~3/uri/rfc2396_parser
Found this through a comment under this article https://docs.knapsackpro.com/2020/uri-escape-is-obsolete-percent-encoding-your-query-string
I also tested this against some other strings in my setup, and it seems to retain commas the same way URI.escape does, in contrast to ERB::Util.url_encode.
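Applied to the data URL use case from the question, this could look roughly like the sketch below. The helper name html_data_url is hypothetical, and URI::RFC2396_Parser is used explicitly here (the class the linked docs describe) on the assumption that its escape behaves like the old URI.escape. The last two lines illustrate the comma difference mentioned above.

```ruby
require 'uri'
require 'erb'

# Hypothetical helper (name invented for this example): builds an HTML
# data URL using the RFC 2396 parser's escape, which keeps spaces as %20.
def html_data_url(html)
  "data:text/html,#{URI::RFC2396_Parser.new.escape(html)}"
end

puts html_data_url('<html>this and that</html>')
# data:text/html,%3Chtml%3Ethis%20and%20that%3C/html%3E

# Comma handling: the RFC 2396 escape treats "," as reserved and keeps it,
# while ERB::Util.url_encode percent-encodes it.
puts URI::RFC2396_Parser.new.escape('a,b c')  # a,b%20c
puts ERB::Util.url_encode('a,b c')            # a%2Cb%20c
```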
NOTE:
Since this answer has become so popular, it's probably worth mentioning that you should not blindly change your code to use URI::Parser unless you are certain your project doesn't need a standards-compliant encoder, because URI.escape was deprecated for a reason. So before simply switching to URI::Parser, make sure you have read and understood https://stackoverflow.com/a/13059657/6376353
There is no official RFC 3986-compliant URI escaper in the Ruby standard library today.
See "Why is URI.escape() marked as obsolete and where is this REGEXP::UNSAFE constant?" for background.
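If you need stricter behaviour for a single URI component, one option is to percent-encode every byte that is not in RFC 3986's unreserved set (letters, digits, -, ., _ and ~). The following is only a sketch of that idea, with a made-up method name, and not an official or complete RFC 3986 implementation:

```ruby
# Hypothetical helper (not a standard library method): percent-encodes
# every byte outside the RFC 3986 "unreserved" set.
def percent_encode_component(str)
  # String#b gives a binary copy, so each regex match is a single byte
  # and multibyte characters are encoded byte by byte.
  str.b.gsub(/[^A-Za-z0-9\-._~]/) { |byte| format('%%%02X', byte.ord) }
end

puts percent_encode_component('<html>this and that</html>')
# %3Chtml%3Ethis%20and%20that%3C%2Fhtml%3E
```

Unlike URI.escape, this also encodes reserved characters such as / and the comma, so it is meant for individual components, not for a whole URL.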
There are several methods that have various issues with them as you have discovered and pointed out in the comment: