There is a site with thousands of pages that I want to retrieve from Google Cache. Is there any way to get it back quickly using Google Cache or some other web crawler/archiver?
Not all pages will be cached by Google. Pages that rely on JavaScript might not get cached, and even when they are, the cached copy can be blank if the HTML hadn't loaded when the snapshot was taken. Google does not cache every page it crawls, so some pages you check might not have any cached version at all.
Via search results: to the right of a result's URL, you'll see three vertically stacked dots. Click them to open a pop-up about the page; at the very bottom of the pop-up is a button labeled “Cached.” Click it to view the most recent cached version of the page.
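If you'd rather script this than click through thousands of results, cached snapshots have also been served from the webcache.googleusercontent.com endpoint. Here's a minimal Python sketch, assuming that endpoint is still up; the page URL used is hypothetical, and Google rate-limits aggressively, so expect blocks on bulk use:

```python
# Minimal sketch: fetch Google's cached copy of one page, assuming the
# webcache.googleusercontent.com endpoint still serves snapshots.
# Google rate-limits heavily; throttle and expect blocks on bulk use.
import urllib.parse
import urllib.request

CACHE_ENDPOINT = "https://webcache.googleusercontent.com/search?q=cache:"

def fetch_cached(url: str) -> str:
    """Return the HTML of Google's cached snapshot of `url`."""
    request = urllib.request.Request(
        CACHE_ENDPOINT + urllib.parse.quote(url, safe=""),
        headers={"User-Agent": "Mozilla/5.0"},  # default urllib UA is often rejected
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8", errors="replace")

if __name__ == "__main__":
    # Hypothetical page; replace with one of your own URLs.
    print(fetch_cached("http://example.com/some-page")[:500])
```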
I created a free service to recover websites; it can retrieve most pages from a search engine's cache.
The service's output is a zipped file containing the HTML recovered from the cache. It is still in beta and needs a lot of tweaks and bug fixes, but hopefully it can help you or other people who run into the same problem.
UPDATE: I didn't have time to continue developing the service, so it has been shut down.
You can see what Google (still) knows about a website by using a site: restriction:

http://www.google.com/search?q=site:[domain]
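For example, with the hypothetical domain example.com, the query would be:

http://www.google.com/search?q=site:example.com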
You might also check out the Internet Archive.
(In either case, you'd probably want some heavy-duty automation to fetch thousands of pages; see the sketch below.)
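As a starting point, here's a minimal Python sketch that uses the Wayback Machine's public CDX API to enumerate a domain's archived pages and download one snapshot per URL. The domain example.com is a placeholder; the query parameters (output, fl, filter, collapse) and the id_ flag for raw snapshots are documented CDX/Wayback features:

```python
# Sketch: bulk-recover a site's pages from the Internet Archive's Wayback
# Machine via its CDX API. "example.com" is a placeholder domain.
import json
import time
import urllib.parse
import urllib.request

DOMAIN = "example.com"  # placeholder: the site you want to recover

def list_snapshots(domain):
    """Ask the CDX API for one successful capture per unique URL under the domain."""
    query = urllib.parse.urlencode({
        "url": domain + "/*",
        "output": "json",
        "fl": "timestamp,original",
        "filter": "statuscode:200",
        "collapse": "urlkey",  # one row per unique page
    })
    with urllib.request.urlopen("http://web.archive.org/cdx/search/cdx?" + query) as resp:
        rows = json.load(resp)
    return rows[1:]  # first row is the header ["timestamp", "original"]

def fetch_snapshot(timestamp, original):
    """Fetch the raw archived page; the `id_` flag strips the Wayback toolbar."""
    url = f"http://web.archive.org/web/{timestamp}id_/{original}"
    with urllib.request.urlopen(url) as resp:
        return resp.read()

if __name__ == "__main__":
    for timestamp, original in list_snapshots(DOMAIN):
        html = fetch_snapshot(timestamp, original)
        print(f"{original}: {len(html)} bytes")
        time.sleep(1)  # throttle requests so the archive doesn't block you
```

With thousands of pages this will take a while at one request per second, but polite throttling is what keeps the archive from blocking you partway through.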