Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How can I set scheme to a URI object in Ruby

Tags:

url

uri

ruby

I'm trying to parse a URI from user input. I'm assuming some users won't put the scheme in their URI's and I want to default to "http".

The following code doesn't work:

require 'uri'   

uri_to_check = URI::parse("www.google.com")
uri_to_check.scheme = "http" unless uri_to_check.scheme

puts uri_to_check.to_s

I expect to see "http://www.google.com" but I get "http:www.google.com". Is it even possible to do it this way?

If so, what am I missing?

Is there a better way to do this?

like image 409
maz Avatar asked Oct 02 '10 12:10

maz


2 Answers

The leading slashes (//) indicate that the URL is an IP-based address, and are needed to flag the hostname so URI can parse them correctly.

Wikipedia has some good overviews and examples of use:

http://en.wikipedia.org/wiki/Url , http://en.wikipedia.org/wiki/URI_scheme , http://en.wikipedia.org/wiki/URL_normalization

The best information is in the spec itself: http://www.ietf.org/rfc/rfc1738.txt particularly in section 3.1 "3.1. Common Internet Scheme Syntax".

You might want to consider using the Addressable gem. It's smarter and is what I use when I need to do a lot of URI parsing or manipulation.

http://addressable.rubyforge.org/ and http://addressable.rubyforge.org/api/Addressable/URI.html

like image 57
the Tin Man Avatar answered Oct 21 '22 11:10

the Tin Man


When the string you want to be parsed doesn't conatin a scheme, URI doesn't recognize it as a hostname:

irb(main):001:0> require 'uri'
=> true
irb(main):002:0> uri = URI::parse("www.google.com")
=> #<URI::Generic:0x11cfc88 URL:www.google.com>
irb(main):003:0> uri.path
=> "www.google.com"
irb(main):004:0> uri.host
=> nil

When you set the scheme as you do in your example and then call to_s the URI is build without the host...

You can try something like the following: (That's a quick hack, I don't know the URI details...)

uri = URI::parse("www.google.com")
if uri.scheme.nil? && uri.host.nil?
  unless uri.path.nil?
    uri.scheme = "http"
    uri.host = uri.path
    uri.path = ""
  end
end

puts uri.to_s
like image 21
Scolytus Avatar answered Oct 21 '22 10:10

Scolytus