require 'uri'
uri = URI.parse 'http://dxczjjuegupb.cloudfront.net/wp-content/uploads/2017/10/Оуэн-Мэтьюс.jpg'
Browsers have no problem with http://dxczjjuegupb.cloudfront.net/wp-content/uploads/2017/10/Оуэн-Мэтьюс.jpg, yet URI.parse rejects it, so I'm wondering whether this Ruby class is a little outdated. Should I abandon it completely, or add some error handling…
The answer just came to me by asking myself the question:
begin
  uri = URI.parse(url)
rescue URI::InvalidURIError
  # Fallback for non-ASCII URLs; URI.escape only exists on Ruby < 3.0
  uri = URI.parse(URI.escape(url))
end
With kudos to all the URI.escape answers (also known as URI.encode), these methods were officially made obsolete by Ruby 2.7 - i.e. they now produce a visible "URI.escape is obsolete" warning message when you use them, whereas previously they were only deprecated. In Ruby 3.0 these methods were removed completely and are no longer available at all - not even with a warning.
Unfortunately, as far as I can tell, Ruby's standard library URI class does not offer any alternative for handling URIs containing non-ASCII characters, which are so common these days - <sarcasm>now that the web has gone international</sarcasm>.
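Without URI.escape, one workaround that stays within the standard library is to percent-encode the non-ASCII characters by hand before parsing. The following is only a minimal sketch, not a standard library API: escape_non_ascii and parse_lenient are hypothetical helper names, and the sketch assumes the URL is a UTF-8 string whose only problem is non-ASCII characters (spaces and other unsafe ASCII characters are left untouched).

```ruby
require 'uri'

# Hypothetical helper: replace each non-ASCII character with the
# percent-encoded form of its UTF-8 bytes, leaving ASCII as-is.
def escape_non_ascii(url)
  url.gsub(/[^[:ascii:]]/) do |char|
    char.bytes.map { |byte| format('%%%02X', byte) }.join
  end
end

# Hypothetical helper: try a normal parse first and fall back to the
# manually escaped URL only when URI.parse rejects the input.
def parse_lenient(url)
  URI.parse(url)
rescue URI::InvalidURIError
  URI.parse(escape_non_ascii(url))
end

uri = parse_lenient('http://example.com/Оуэн-Мэтьюс.jpg')
puts uri
```

This keeps the begin/rescue shape of the earlier answer but no longer depends on a removed method. It does nothing for internationalized host names, which need more than percent-encoding.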
The best solution I came up with is the addressable gem, which contains the URI class we deserve - it handles everything the world has to throw at it, and you can get an "HTTP safe" URI using the #display_uri method:
Addressable::URI.parse("http://example.com/Оуэн-Мэтьюс.jpg")
=> #<Addressable::URI:0xc8 URI:http://example.com/Оуэн-Мэтьюс.jpg>
Addressable::URI.parse("http://example.com/Оуэн-Мэтьюс.jpg").display_uri.to_s
=> "http://example.com/%D0%9E%D1%83%D1%8D%D0%BD-%D0%9C%D1%8D%D1%82%D1%8C%D1%8E%D1%81.jpg"
Addressable::URI also comes with all kinds of goodies, such as port inference (you can tell whether the URL originally contained a port specification, or ignore the distinction and just get the effective port), and URL canonicalization (given a base URL, take a possibly relative URL and generate an absolute URL).
Here's how to use this with net/http:
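require 'net/http'
require 'addressable/uri'
url = Addressable::URI.parse('http://example.com/Оуэн-Мэтьюс.jpg')
response = Net::HTTP.start(url.host, url.inferred_port,
                           :use_ssl => url.scheme == 'https') do |http|
  # request_uri is the path plus query string, percent-encoded via display_uri
  http.request(Net::HTTP::Get.new(url.display_uri.request_uri))
end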