
Any API to search Google Cache?

I am trying to search within Google Cache, so I need to fire this query:

http://webcache.googleusercontent.com/search?q=cache:news.ycombinator.com/news+hacker+news
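For reference, a query URL of this shape can be assembled in Ruby roughly as follows. This is only a sketch: `cache_search_url` is a hypothetical helper name, and it assumes the endpoint accepts a form-encoded `q` parameter (`%3A` for `:`, `%2F` for `/`, `+` for spaces) the same way it accepts the literal form above.

```ruby
require "uri"

# Hypothetical helper (not part of any library): builds a Google Cache
# query URL for a page plus optional search terms.
def cache_search_url(page, terms = [])
  # "cache:<page>" followed by the search terms, separated by spaces
  query = (["cache:#{page}"] + terms).join(" ")
  # encode_www_form percent-encodes reserved characters and turns spaces into "+"
  "http://webcache.googleusercontent.com/search?" + URI.encode_www_form(q: query)
end

puts cache_search_url("news.ycombinator.com/news", %w[hacker news])
```

Building the query with `URI.encode_www_form` avoids hand-escaping search terms that contain spaces or punctuation.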

I then want to extract some content, such as the timestamp, from the resulting page. But when I do this using curl (from Ruby), I get a permission denied error, i.e. scraping is blocked, which was expected.

So, is there any way to search Google Cache (either an API or some kind of scraping workaround) and extract information such as the timestamp?

zengr asked Oct 23 '10 04:10



1 Answer

I didn't find any API, but I can scrape the page using Hpricot or Nokogiri in Rails (curl in Rails gives a permission denied error). I will post the code once I figure out how to extract the timestamp from the above URL using these gems.
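In the meantime, here is a rough sketch of how the timestamp extraction might look once the cached page's HTML has been fetched. To stay dependency-free it uses a plain regular expression rather than Hpricot or Nokogiri. Everything specific here is an assumption: `cache_timestamp` and `SNAPSHOT_PATTERN` are made-up names, the banner wording ("as it appeared on ... GMT") may not match what the cached page actually contains, and the sample HTML is invented for illustration.

```ruby
require "time"

# Assumed banner wording; the real cached page may phrase or format this differently.
SNAPSHOT_PATTERN = /as it appeared on\s+(.+?\bGMT)/

# Hypothetical helper: returns the snapshot time as a UTC Time,
# or nil when no banner (or no parseable date) is found.
def cache_timestamp(html)
  # Crude tag stripping so inline markup inside the banner does not break the match
  text = html.gsub(/<[^>]+>/, " ")
  match = text.match(SNAPSHOT_PATTERN)
  return nil unless match
  # Collapse whitespace left by tag stripping, then parse the date string
  Time.parse(match[1].gsub(/\s+/, " ")).utc
rescue ArgumentError
  nil
end

# Invented sample input, not real Google Cache output
sample = "<div>It is a snapshot of the page as it appeared on <b>22 Oct 2010 14:03:11 GMT</b>.</div>"
puts cache_timestamp(sample).iso8601
```

If the banner turns out to sit in a stable element, replacing the regex with a Nokogiri selector would probably be more robust than stripping tags by hand.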

Does anyone have a better solution?

zengr answered Sep 27 '22 22:09