Where can I find a comprehensive list of crawler or spider IP addresses? I need the IPs used by Google, Yahoo, Microsoft, and other search engines that regularly crawl my sites.
I do not want to block them, so please keep robots.txt out of the answers. The list is for a filter that produces statistical reports on the activity on each page.
Please post links to good sources that could be used, paid or free.
Alternatively, you can identify Googlebot by IP address by matching the crawler's IP address to the list of Googlebot IP addresses. For other Google IP addresses from where your site may be accessed (for example, by user request or Apps Scripts), match the accessing IP address against the list of Google IP addresses.
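As a rough sketch of the IP matching described above, the following Python checks an address against a set of CIDR ranges using the standard `ipaddress` module. The ranges shown are reserved documentation ranges used as placeholders, not real Googlebot ranges, and the names `EXAMPLE_CRAWLER_RANGES`, `build_networks`, and `is_crawler_ip` are made up for this example. In practice the ranges would come from whatever list the search engine publishes.

```python
import ipaddress

# Placeholder ranges (reserved for documentation), NOT real crawler ranges.
# Replace them with the ranges taken from the search engine's published list.
EXAMPLE_CRAWLER_RANGES = ["192.0.2.0/24", "2001:db8::/32"]


def build_networks(cidrs):
    """Parse CIDR strings into network objects once, up front."""
    return [ipaddress.ip_network(cidr) for cidr in cidrs]


def is_crawler_ip(ip, networks):
    """Return True if the address falls inside any of the given networks."""
    addr = ipaddress.ip_address(ip)
    # Only compare addresses and networks of the same IP version.
    return any(
        addr.version == net.version and addr in net for net in networks
    )


networks = build_networks(EXAMPLE_CRAWLER_RANGES)
print(is_crawler_ip("192.0.2.17", networks))
print(is_crawler_ip("198.51.100.1", networks))
```

For a statistics filter, the result could be used to tag each log entry as crawler or visitor traffic rather than to block anything.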
"Crawler" (sometimes also called a "robot" or "spider") is a generic term for any program that is used to automatically discover and scan websites by following links from one webpage to another. Google's main crawler is called Googlebot.
Your web server logs. I believe they're free.
You probably don't want to do this by IP address. Most crawlers send a distinctive user agent string when they crawl your site, and it's much more likely you want to use that to identify them. I don't know where you can find a good list of those, though.
EDIT: Actually, this page I found with Google seems to both answer your question a bit and also give the user agents (which is still more likely the better approach).
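The user agent approach could look something like the Python sketch below. The token list is only illustrative (Googlebot, bingbot, and Yahoo's Slurp are commonly seen tokens) and is nowhere near complete. The names `CRAWLER_TOKENS` and `is_crawler_user_agent` are invented for this example.

```python
import re

# Illustrative, incomplete list of substrings seen in crawler user agents.
CRAWLER_TOKENS = ("Googlebot", "bingbot", "Slurp")

# One case-insensitive pattern that matches any of the tokens.
CRAWLER_RE = re.compile(
    "|".join(re.escape(token) for token in CRAWLER_TOKENS), re.IGNORECASE
)


def is_crawler_user_agent(user_agent):
    """Return True if the user agent string contains a known crawler token."""
    # A missing or empty user agent is treated as "not a known crawler".
    return bool(user_agent) and CRAWLER_RE.search(user_agent) is not None


ua = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
print(is_crawler_user_agent(ua))
```

A reporting filter could call this on the user agent field of each log line and count crawler hits separately from visitor hits.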