I followed the tutorial on the NutchWiki to set up Nutch.
I downloaded the Nutch 2.x source and set all the configurations.
The problem occurs as soon as I start to crawl.
When I run this command:
bin/nutch inject crawl/crawldb urls
I get this error message:
Unrecognized arg urls
I followed all the steps in the tutorial: created the directories, made the changes to the configuration files, and so on. I also have a question: there is no crawldb directory in apache-nutch-2.x/runtime/local/. Is it generated automatically, or do I need to create it manually?
Any help with this problem would be appreciated.
I ran into the same problem. The documentation seems to be outdated: it is written for 1.x.
For 2.x, I tried the following and it worked for me:
bin/nutch inject urls
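For completeness, here is a rough sketch of the whole step, assuming you run it from apache-nutch-2.x/runtime/local. The file name seed.txt and the example URL are only placeholders I picked, not something from the tutorial. As far as I understand, 2.x keeps its crawl data in the storage backend configured through Gora (HBase, for example) rather than in a crawl/crawldb directory on disk, which would explain why that directory is missing and why inject only takes the seed directory.

```shell
# Run from the Nutch runtime directory (assumed: apache-nutch-2.x/runtime/local).
mkdir -p urls

# seed.txt is a placeholder name; put one URL per line.
printf '%s\n' 'http://nutch.apache.org/' > urls/seed.txt

# In 2.x the seed directory is the only positional argument;
# there is no crawl/crawldb path as in the 1.x tutorial.
if [ -x bin/nutch ]; then
  bin/nutch inject urls
else
  echo "bin/nutch not found; run this from runtime/local" >&2
fi
```

If the inject succeeds, the URLs should end up in whatever data store you configured, not in a local directory.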
Hope it helps.