
robots.txt parser in Java

I want to know how to parse a robots.txt file in Java.

Is there existing code or a library that already does this?

zahir hussain asked Jun 29 '10 13:06


1 Answer

Heritrix is an open-source web crawler written in Java. Looking through its Javadoc, I see that it has a utility class, Robotstxt, for parsing robots.txt files.
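If pulling in a full crawler is more than you need, here is a rough idea of what a hand-rolled parser involves. This is only a minimal sketch under simplifying assumptions, not Heritrix's Robotstxt API: the class name SimpleRobotsTxt and its methods are hypothetical, it only understands User-agent and Disallow lines, it matches paths by plain prefix (no wildcards, no Allow rules), and it merges the "*" group with any group whose agent token appears in your crawler's name instead of picking the single most specific group.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper for illustration only; this is NOT Heritrix's Robotstxt class.
final class SimpleRobotsTxt {
    // Disallowed path prefixes that apply to the agent we parsed for.
    private final List<String> disallowed = new ArrayList<>();

    // Collects Disallow prefixes from groups addressed to "*" or to the given agent.
    static SimpleRobotsTxt parse(String content, String userAgent) {
        SimpleRobotsTxt rules = new SimpleRobotsTxt();
        String agent = userAgent.toLowerCase(Locale.ROOT);
        boolean groupApplies = false;  // does the current group match our agent?
        boolean inAgentLines = false;  // inside a run of consecutive User-agent lines?

        for (String rawLine : content.split("\\r?\\n")) {
            // Strip trailing comments and surrounding whitespace.
            int hash = rawLine.indexOf('#');
            String line = (hash >= 0 ? rawLine.substring(0, hash) : rawLine).trim();
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue;  // blank or malformed line
            }
            String field = line.substring(0, colon).trim().toLowerCase(Locale.ROOT);
            String value = line.substring(colon + 1).trim();

            if (field.equals("user-agent")) {
                if (!inAgentLines) {
                    // First User-agent line after rules: a new group starts here.
                    groupApplies = false;
                    inAgentLines = true;
                }
                String token = value.toLowerCase(Locale.ROOT);
                if (token.equals("*") || (!token.isEmpty() && agent.contains(token))) {
                    groupApplies = true;
                }
            } else {
                inAgentLines = false;
                // An empty Disallow value means "nothing is disallowed", so skip it.
                if (field.equals("disallow") && groupApplies && !value.isEmpty()) {
                    rules.disallowed.add(value);
                }
            }
        }
        return rules;
    }

    // Returns false if the path starts with any collected Disallow prefix.
    boolean isAllowed(String path) {
        for (String prefix : disallowed) {
            if (path.startsWith(prefix)) {
                return false;
            }
        }
        return true;
    }
}
```

You would fetch the robots.txt text yourself, call SimpleRobotsTxt.parse(text, "MyCrawler"), and then check isAllowed("/some/path") before requesting each URL. For anything beyond a toy crawler, an existing, tested parser is the safer choice.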

Bill the Lizard answered Sep 19 '22 18:09