
Remove "www", "http://" from string

How can I remove "www", "http://", "https://" from strings using Ruby?

I tried this but it didn't work:

s.gsub('/(?:http?:\/\/)?(?:www\.)?(.*)\/?$/i', '')

Here's what I'm doing in Rails:

<%= auto_link(job.description) do |url| url.truncate(25).gsub('http://', '') end %>

The URLs are truncated, but my goal is to remove the beginning of each link, such as "www" or "http://", so that the link looks like "google.com/somepage/d..." rather than "http://google.com/some...".

Martin Petrov asked Apr 20 '12 17:04




3 Answers

This method should catch all three variations ("https://www.", "http://www." and a bare "www."):

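def strip_url(url)
  # Work on a copy so the caller's string is not mutated (and frozen strings don't raise).
  url = url.dup

  # The dots are escaped so that "www\." matches a literal dot, not any character.
  url.sub!(/https:\/\/www\./, '') if url.include? "https://www."

  url.sub!(/http:\/\/www\./, '')  if url.include? "http://www."

  url.sub!(/www\./, '')           if url.include? "www."

  return url
end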

strip_url("http://www.google.com")
   => "google.com" 
strip_url("https://www.facebook.com")
   => "facebook.com" 
strip_url("www.stackoverflow.com")
  => "stackoverflow.com" 
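
For the truncation in the question, it probably helps to strip the prefix first and truncate afterwards, so that the 25 characters are spent on the meaningful part of the link. The following is only a sketch in plain Ruby: `shorten` and `display_url` are hypothetical names, `shorten` is a stand-in for ActiveSupport's `String#truncate`, and `strip_url` here is a compact non-mutating variant that also drops a bare "http://" or "https://".

```ruby
# Compact, non-mutating stand-in for a strip_url method (assumption:
# a leading scheme and a leading "www." are the only parts to drop).
def strip_url(url)
  url.sub(/\Ahttps?:\/\//, '').sub(/\Awww\./, '')
end

# Hypothetical plain-Ruby stand-in for ActiveSupport's String#truncate:
# the result, including the "..." suffix, is at most `max` characters long.
def shorten(text, max)
  text.length > max ? "#{text[0, max - 3]}..." : text
end

# Strip first, then truncate.
def display_url(url, max = 25)
  shorten(strip_url(url), max)
end

puts display_url("http://www.google.com/somepage/details")
# prints "google.com/somepage/de..."
```

Inside the auto_link block from the question, the same ordering would presumably mean calling the strip method on url before calling truncate.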
Josh answered Oct 31 '22 10:10


def strip_url(target_url)
  target_url.gsub("http://", "")
            .gsub("https://", "")
            .gsub("www.", "")
end

strip_url("http://www.google.com")
 => "google.com" 
strip_url("https://www.google.com")
 => "google.com" 
strip_url("http://google.com")
 => "google.com"
strip_url("https://google.com")
 => "google.com" 
strip_url("www.google.com")
 => "google.com" 
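
One thing to be aware of: gsub with plain strings removes every occurrence, including ones in the middle of the URL. If only the leading part should go, a prefix-only variant is one option. The sketch below assumes Ruby 2.5 or later for `String#delete_prefix`; the method names `strip_everywhere` and `strip_leading` are made up for the comparison.

```ruby
# The chained gsub approach: removes every occurrence, anywhere in the string.
def strip_everywhere(url)
  url.gsub("http://", "").gsub("https://", "").gsub("www.", "")
end

# Prefix-only variant: each call removes its argument only if the
# string starts with it (String#delete_prefix, Ruby 2.5+).
def strip_leading(url)
  url.delete_prefix("http://").delete_prefix("https://").delete_prefix("www.")
end

url = "http://example.com/go?to=http://www.other.com"

puts strip_everywhere(url)
# prints "example.com/go?to=other.com"

puts strip_leading(url)
# prints "example.com/go?to=http://www.other.com"
```

Both variants are case-sensitive, so "HTTP://" would be left in place by either one.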
rusllonrails answered Oct 31 '22 09:10


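s = s.sub(/^https?:\/\//, '').sub(/^www\./, '')

If you don't want to reassign with s =, use sub! in place of each sub. Call them as separate statements rather than chaining them, because sub! returns nil when nothing was replaced.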

The problems with your code are:

  1. A question mark goes AFTER the character (or group) it makes optional: http? makes the "p" optional, whereas https? makes the "s" optional.
  2. Replace one pattern per sub call. You can chain several calls to handle several patterns.
  3. Use sub instead of gsub, with ^ at the start of the Regexp, so that only the http:// at the beginning is replaced and any occurrence in the middle is left alone.
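
If a single pattern is preferred over chaining, the points above can be folded into one anchored Regexp with two optional groups. This is just a sketch: `strip_prefix` and `PREFIX` are hypothetical names, and `\A` is used instead of `^` so that the match is tied to the start of the whole string rather than the start of any line. Note also that the pattern is written as a Regexp literal; the attempt in the question wrapped it in quotes, and gsub treats a String pattern as literal text.

```ruby
# Optional scheme (http:// or https://) followed by an optional "www.",
# anchored at the very start of the string, case-insensitive.
PREFIX = /\A(?:https?:\/\/)?(?:www\.)?/i

# Hypothetical helper: returns a new string and leaves the argument untouched.
def strip_prefix(s)
  s.sub(PREFIX, '')
end

puts strip_prefix("http://www.google.com")       # prints "google.com"
puts strip_prefix("https://google.com/somepage") # prints "google.com/somepage"
puts strip_prefix("www.stackoverflow.com")       # prints "stackoverflow.com"
puts strip_prefix("google.com")                  # prints "google.com"
```

Because both groups are optional, a string with no prefix simply matches the empty string at the start and comes back unchanged.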
SwiftMango answered Oct 31 '22 09:10