
Protecting email addresses from spam bots / web crawlers

How do you prevent emails being gathered from web pages by email spiders? Does mailto: linking them increase the likelihood of them being picked up? Is URL-encoding useful?

Obviously the best counter-measure is to only show email addresses to logged-in users, or to provide a contact form instead of an email address. But in terms of purely client-side solutions, what is available?

Zaz asked Sep 08 '10 01:09




1 Answer

Most email spiders don't have JavaScript interpreters, so if you really need the mailto: link you can inject it with JavaScript. Just make sure the address is obscured in the JavaScript somehow, e.g.:

// myLink is the <a> element that should receive the mailto: link
myLink.href = 'mai' + 'lto:' + 'bob'
            + '@'
            + 'example.com';

If you need to display the email address on the page, a common solution is to generate an image of it using something like PHP's GD library (although the JavaScript injection should work OK for this too).

The idea is to remove the email addresses from the HTML and inject them with JavaScript. That way the address never appears in its original form in any of the HTTP traffic, which is all that a spider without a JavaScript interpreter looks at.
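As a rough sketch of that idea, the snippet below fills in placeholder links once the page has loaded. The helper names (buildAddress, injectEmailLinks), the "email" class, and the data-user / data-domain attributes are all made up for illustration; the only assumption is markup along the lines of `<a class="email" data-user="bob" data-domain="example.com"></a>`. It only helps against spiders that don't run JavaScript.

```javascript
// Hypothetical helper: joins the fragments so the complete address
// never appears as one literal string in the page source.
function buildAddress(user, domain) {
  return user + '@' + domain;
}

// Hypothetical helper: fills every <a class="email"> placeholder found in doc.
// Assumed markup (not from the original answer):
//   <a class="email" data-user="bob" data-domain="example.com"></a>
function injectEmailLinks(doc) {
  var links = doc.querySelectorAll('a.email');
  for (var i = 0; i < links.length; i++) {
    var link = links[i];
    var address = buildAddress(
      link.getAttribute('data-user'),
      link.getAttribute('data-domain')
    );
    // The scheme is split too, as in the answer's example.
    link.href = 'mai' + 'lto:' + address;
    // Also show the address as the visible link text.
    link.textContent = address;
  }
  return links.length; // number of placeholders filled
}

// Run only in a browser; the guard lets the file load outside one as well.
if (typeof document !== 'undefined') {
  document.addEventListener('DOMContentLoaded', function () {
    injectEmailLinks(document);
  });
}
```

Splitting the user and domain across two attributes means the raw HTML contains neither an "@" sign nor a "mailto:" string, which is what simple pattern-matching harvesters tend to search for.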

Dagg Nabbit answered Oct 08 '22 09:10