How does Google recognize adult content with SafeSearch?

I am creating a search engine (for studying) and I want to know how Google recognizes adult content and images with SafeSearch (http://en.wikipedia.org/wiki/Safesearch).

The programming language doesn't matter; I only want to know the general approach, independent of any particular language.

xRobot asked Jan 02 '11 00:01



1 Answer

If the rules for any sort of content filter fell into the hands of people trying to get that content through the filter, the filter would become ineffective.

So I imagine that Google's rules (1) are not publicly available and (2) change frequently.

That said, starting with a small blacklist of adult sites and following their outgoing links (and/or finding sites that link to the blacklisted sites) probably finds a huge number of adult sites. But by no means all of them, so you'd want some sort of text processing and image recognition algorithms in addition.
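To make the link-following idea concrete, here is a minimal sketch in Python. It is not how Google does it (those rules aren't public); it only illustrates the approach above. The function name `expand_blacklist`, the `max_hops` parameter, and the toy link graph with its `.example` domains are all made up for illustration, and the graph is assumed to come from your own crawler as a mapping of site to the sites it links to.

```python
from collections import deque


def expand_blacklist(seeds, outlinks, max_hops=1):
    """Breadth-first expansion of a seed blacklist over a link graph.

    seeds    -- iterable of known adult sites (hypothetical data)
    outlinks -- dict mapping a site to the sites it links to
    max_hops -- how many link steps away from a seed to explore

    Returns a dict of site -> hop distance from the nearest seed.
    """
    # Reverse index, so we can also find sites that link TO a flagged site.
    inlinks = {}
    for src, targets in outlinks.items():
        for dst in targets:
            inlinks.setdefault(dst, set()).add(src)

    hops = {site: 0 for site in seeds}
    queue = deque(seeds)
    while queue:
        site = queue.popleft()
        if hops[site] >= max_hops:
            continue  # do not explore beyond the hop limit
        neighbours = set(outlinks.get(site, ())) | inlinks.get(site, set())
        for nxt in neighbours:
            if nxt not in hops:
                hops[nxt] = hops[site] + 1
                queue.append(nxt)
    return hops


# Toy link graph (made-up domains).
TOY_LINKS = {
    "adult-seed.example": ["adult-a.example", "adult-b.example"],
    "adult-a.example": ["adult-c.example"],
    "portal.example": ["adult-seed.example", "news.example"],
    "news.example": ["weather.example"],
}

candidates = expand_blacklist(["adult-seed.example"], TOY_LINKS, max_hops=1)
for site, distance in sorted(candidates.items()):
    print(distance, site)
```

Treat the result as a list of candidates rather than a verdict: in the toy graph, `portal.example` gets picked up only because it links to the seed, which is exactly the kind of case the text and image checks would have to sort out.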

NOTE: A popular theory is that adult content providers pay people to ask questions on stackoverflow.com so that Jon Skeet and Marc Gravell will have less time to update the SafeSearch filters. However, it is easily shown that Jon and Marc answer questions at such a high rate that any such strategy would not be economically viable.

Ben Voigt answered Nov 05 '22 17:11
