I'm using open-uri to open URLs.
resp = open("http://sub_domain.domain.com")
If the hostname contains an underscore, I get an error:
URI::InvalidURIError: the scheme http does not accept registry part: sub_domain.domain.com (or bad hostname?)
I understand that this is because, according to the RFCs, hostnames may contain only letters, digits, and hyphens. Is there a workaround?
This looks like a bug in URI, and open-uri, HTTParty, and many other gems rely on URI.parse.
Here's a workaround:
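require 'net/http'
require 'open-uri'

# Try open-uri first; fall back to Net::HTTP when URI rejects the hostname.
def hopen(url)
  open(url)
rescue URI::InvalidURIError
  host = url.match(%r{.+://([^/]+)})[1]
  path = url.partition(host)[2]
  # partition returns "" (never nil) when nothing follows the host
  path = "/" if path.empty?
  Net::HTTP.get(host, path)
end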
resp = hopen("http://dear_raed.blogspot.com/2009_01_01_archive.html")
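To see what the rescue branch does without touching the network, here is a minimal sketch of the same host/path split. The helper name split_host_and_path is made up for illustration and is not part of open-uri or Net::HTTP; it also assumes the URL carries no userinfo or port.

```ruby
# Hypothetical helper (not part of any library): mirrors the host/path
# split used in the rescue branch of hopen.
def split_host_and_path(url)
  # Capture everything between "scheme://" and the next "/", "?" or "#".
  host = url[%r{\A[a-z][a-z0-9+.-]*://([^/?#]+)}i, 1]
  raise ArgumentError, "no host found in #{url.inspect}" if host.nil?

  # String#partition returns "" (never nil) when nothing follows the host.
  path = url.partition(host)[2]
  path = "/#{path}" unless path.start_with?("/")
  [host, path]
end

host, path = split_host_and_path("http://dear_raed.blogspot.com/2009_01_01_archive.html")
puts host
puts path
```

The resulting pair is what gets handed to Net::HTTP.get(host, path) in the fallback.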
URI has an old-fashioned idea of what a URL looks like. Lately I'm using addressable to get around that:
require 'open-uri'
require 'addressable/uri'
class URI::Parser
  def split(url)
    a = Addressable::URI.parse(url)
    # Same order as URI.split: scheme, userinfo, host, port, registry, path, opaque, query, fragment
    [a.scheme, a.userinfo, a.host, a.port, nil, a.path, nil, a.query, a.fragment]
  end
end
resp = open("http://sub_domain.domain.com") # Yay!
Don't forget to run gem install addressable.
This initializer in my Rails app seems to make URI.parse work, at least:
# config/initializers/uri_underscore.rb
class URI::Generic
  def initialize_with_registry_check(scheme, userinfo, host, port, registry,
                                     path, opaque, query, fragment,
                                     parser = DEFAULT_PARSER, arg_check = false)
    if %w(http https).include?(scheme) && host.nil? && registry =~ /_/
      # The parser put the underscored hostname in registry; pass it on as the host.
      initialize_without_registry_check(scheme, userinfo, registry, port, nil, path, opaque, query, fragment, parser, arg_check)
    else
      initialize_without_registry_check(scheme, userinfo, host, port, registry, path, opaque, query, fragment, parser, arg_check)
    end
  end
  alias_method_chain :initialize, :registry_check
end
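alias_method_chain was deprecated in Rails 5.0 and later removed, so on a newer app the same wrapping can be sketched with Module#prepend instead. This is only a sketch under that assumption: the module name RegistryCheck is made up here, and the registry branch only matters on URI versions that still put an underscored hostname into registry.

```ruby
require 'uri'

# Hypothetical module name; wraps URI::Generic#initialize the same way
# the alias_method_chain initializer does, using Module#prepend.
module RegistryCheck
  def initialize(scheme, userinfo, host, port, registry,
                 path, opaque, query, fragment,
                 parser = URI::DEFAULT_PARSER, arg_check = false)
    if %w(http https).include?(scheme) && host.nil? && registry.to_s.include?("_")
      # Treat the rejected registry value as the host.
      super(scheme, userinfo, registry, port, nil, path, opaque, query, fragment, parser, arg_check)
    else
      # Bare super forwards all arguments unchanged.
      super
    end
  end
end

URI::Generic.prepend(RegistryCheck)

uri = URI.parse("http://sub-domain.domain.com/x")
puts uri.host
```

Ordinary URLs take the else branch and parse exactly as before; only the http/https case with a missing host and an underscored registry is rerouted.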