
Retrieving an entire website using Google Cache?

I want to recover a site that had thousands of pages from Google Cache. Is there any way to get it back quickly using Google Cache or some other web crawler or archiver?

stockoverflow asked Jul 29 '10 06:07


People also ask

Does Google cache all websites?

Google does not cache every page it crawls, so some pages you check might not have a cached version. Pages that rely on JavaScript might not get cached, and if they are, the cached copy could be blank if no HTML had loaded when the snapshot was taken.

How do I go further back on Google cache?

Via search results: to the right of the URL, you'll see three dots in a vertical line. Click them to open a pop-up on the page. At the very bottom of the pop-up is a button that says “Cached.” Click it to view the most recent cached version of the page.


2 Answers

I created a free service for recovering a website; it can retrieve most pages from the search engines' caches.

The output of the service is a zipped file containing your HTML from the search engines' caches. It is still in beta, so it needs a lot of tweaks and bug fixes, but hopefully it can help you or other people who run into the same problem.

UPDATE: I didn't have time to continue developing the service, so it is now closed.

Dofs answered Oct 23 '22 03:10


You can see what Google (still) knows about a website by restricting a search to that site with the site: operator:

http://www.google.com/search?q=site:[domain]

You might also check out the Internet Archive.

(In either case, you’d probably want some heavy-duty automation to fetch thousands of pages.)
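For the Internet Archive route, the automation could look roughly like the sketch below. It is only an illustration under some assumptions: it uses the Wayback Machine's CDX query endpoint to list captured URLs for a domain and the `id_` URL modifier to request the unmodified archived page. The function names and `example.com` are made up for the example, and a real run over thousands of pages would need rate limiting and error handling.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the Wayback Machine CDX endpoint; check the Internet Archive's
# documentation for current parameters and usage limits.
CDX_ENDPOINT = "http://web.archive.org/cdx/search/cdx"


def build_cdx_query(domain, limit=None):
    """Build a CDX query URL listing successful captures for a domain."""
    params = [
        ("url", domain),
        ("matchType", "domain"),       # include subdomains and all paths
        ("output", "json"),
        ("fl", "timestamp,original"),  # only the fields needed here
        ("filter", "statuscode:200"),  # skip redirects and errors
        ("collapse", "urlkey"),        # one capture per distinct URL
    ]
    if limit is not None:
        params.append(("limit", str(limit)))
    return CDX_ENDPOINT + "?" + urlencode(params)


def parse_cdx_rows(text):
    """Turn the JSON response into (timestamp, original_url) tuples."""
    rows = json.loads(text) if text.strip() else []
    return [tuple(row) for row in rows[1:]]  # first row is the header


def snapshot_url(timestamp, original):
    """URL of the archived page; 'id_' asks for the page without the
    Wayback Machine's toolbar and link rewriting."""
    return f"https://web.archive.org/web/{timestamp}id_/{original}"


def list_snapshots(domain, limit=None):
    """Query the archive (network access required)."""
    with urlopen(build_cdx_query(domain, limit), timeout=30) as resp:
        return parse_cdx_rows(resp.read().decode("utf-8"))


def fetch_snapshot(timestamp, original):
    """Download one archived page as bytes (network access required)."""
    with urlopen(snapshot_url(timestamp, original), timeout=30) as resp:
        return resp.read()
```

Calling `list_snapshots("example.com", limit=10)` would return a list of `(timestamp, url)` pairs, each of which can be passed to `fetch_snapshot` and written to disk. Pausing between requests is advisable when fetching many pages.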

user413588 answered Oct 23 '22 02:10