I am using Ruby on Rails v3.0.9 and I would like to retrieve the favicon.ico image of each web site that I link to.
That is, if I set the URL http://www.facebook.com/ in my application, I would like to retrieve the Facebook icon and insert it into my web pages. Of course, I would like to do the same for all other web sites.
How can I retrieve favicon.ico icons from web sites automatically? By "automatically" I mean searching a web site for its favicon and getting the link to it. I do not think the file name can simply be assumed, because not all web sites have a favicon named exactly 'favicon.ico', so I would like to detect the actual icon.
P.S.: What I would like to build is something like what Facebook does when you add a link/URL to your Facebook page: it recognizes the related web site's logo and attaches it to the link/URL.
http://getfavicon.appspot.com/ works great for fetching favicons. Just give it the URL of the site and you'll get the favicon back:
http://g.etfv.co/http://www.google.com
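In a Rails app, that URL pattern could be wrapped in a small helper. This is only a sketch: the helper name favicon_service_url and the constant are made up for illustration, and it assumes the service is still reachable at the address shown above.

```ruby
# Assumption: the service accepts the target site URL appended to its own URL,
# as in the example above.
FAVICON_SERVICE = 'http://g.etfv.co/'

# Hypothetical helper: returns the service URL that should serve the favicon
# for +site_url+.
def favicon_service_url(site_url)
  FAVICON_SERVICE + site_url
end
```

In a view, something like image_tag(favicon_service_url(link_url)) would then render the icon next to the link, where link_url stands for whatever URL your model stores.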
I recently wrote a similar solution.
To find a favicon URL that may not point to an .ico file and may not sit in the site root, we have to parse the target site's HTML.
In Ruby on Rails, I used the nokogiri gem for HTML parsing.
First, we look for meta tags whose itemprop attribute contains the keyword image. This is needed when the target site uses the https://schema.org/WebPage markup, which is a more modern approach than a plain link tag.
If we find one, we can use its content attribute as the favicon URL. We should still check that the URL really exists, just to be sure.
If we find no such meta tags, we search for standard link tags whose rel attribute contains the keyword icon. This is the standard W3C case (https://www.w3.org/2005/10/howto-favicon).
Here is some code from my solution:
require 'nokogiri'
require 'open-uri'

# Returns the absolute icon URL declared by +site+, or nil if none is found.
def site_icon_link(site)
  doc = Nokogiri::HTML(URI.open(site)) # use plain open(site) on Ruby < 2.5

  # Prefer schema.org metadata, then fall back to <link rel="...icon...">.
  meta = doc.at_css('meta[itemprop*=image]')
  link = doc.at_css('link[rel*=icon]')
  url = (meta && meta['content']) || (link && link['href'])
  return nil if url.nil? || url.strip.empty?

  # Resolve relative paths such as "/img/icon.png" against the page URL.
  URI.join(site, url.strip).to_s
rescue URI::InvalidURIError
  nil
end
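The answer says the URL should also be checked for real existence, which the code above does not do. One possible way is a HEAD request with Net::HTTP from the standard library. This is a sketch, not part of the original solution: the method name icon_reachable? and the 5-second timeout are assumptions, and redirects are treated as "not reachable" for simplicity.

```ruby
require 'net/http'
require 'openssl'
require 'uri'

# Hypothetical helper: true if +url+ is an http(s) URL that answers a HEAD
# request with a 2xx status. Any parse or network failure counts as false.
def icon_reachable?(url, timeout: 5)
  uri = URI.parse(url)
  return false unless uri.is_a?(URI::HTTP) && uri.host

  Net::HTTP.start(uri.host, uri.port,
                  use_ssl: uri.scheme == 'https',
                  open_timeout: timeout,
                  read_timeout: timeout) do |http|
    http.head(uri.request_uri).is_a?(Net::HTTPSuccess)
  end
rescue URI::InvalidURIError, SocketError, SystemCallError,
       Timeout::Error, OpenSSL::SSL::SSLError
  false
end
```

You could call it on the value returned by site_icon_link and discard the icon when it returns false.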
Favicons can be found in two ways. First, there is the traditional, 'hardcoded' location, http://example.com/favicon.ico.
Second, HTML pages may declare the favicon in their <head> section, with <link rel="icon"...> and a few other variants. (You may want to read the Wikipedia article about favicons.)
So your automated tool can fetch the main page of the given website, parse it and check for suitable <link> tags, and then, as a fallback, try the "hardcoded" favicon.ico name.
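The "declared tags first, hardcoded name last" order can be sketched with the standard library alone. The helper name favicon_candidates is made up for illustration, and it assumes you have already extracted the href values of the <link> tags with whatever HTML parser you use.

```ruby
require 'uri'

# Hypothetical helper: returns the URLs to try, in order. Hrefs declared in
# the page come first, and the traditional /favicon.ico in the site root is
# appended as the fallback.
def favicon_candidates(page_url, declared_hrefs = [])
  # Resolve relative and protocol-relative hrefs against the page URL.
  declared = declared_hrefs.map { |href| URI.join(page_url, href).to_s }
  fallback = URI.join(page_url, '/favicon.ico').to_s
  (declared + [fallback]).uniq
end
```

The caller would then try each candidate in turn and keep the first one that actually responds.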