An alternative web crawler to Nutch [closed]

2 Answers

Scrapy is a Python library for crawling web sites. It is fairly small compared to Nutch and is designed for limited, targeted site crawls. Its project layout follows a Django-like MVC style that I found pretty easy to customize.

answered Oct 04 '22 13:10

nate c

For the crawling part, I really like anemone (Ruby) and crawler4j (Java). Both let you plug in your own logic for link selection and page handling. For each page that you decide to keep, you can easily add a call to Solr.
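The pattern described here (a link-selection hook, a page handler, and a call to Solr for each kept page) can be sketched independently of either library. The following is a rough Python illustration using only the standard library; it is not the anemone or crawler4j API. The Solr core name `pages`, the host `example.com`, the skipped file extensions, and the document fields `id`, `title`, and `content` are all assumptions.

```python
import json
import urllib.request
from urllib.parse import urlparse

# Assumed local Solr instance with a hypothetical core named "pages".
SOLR_UPDATE_URL = "http://localhost:8983/solr/pages/update?commit=true"


def should_follow(url, allowed_host="example.com"):
    """Link-selection hook: stay on one host and skip binary assets."""
    parsed = urlparse(url)
    if parsed.hostname != allowed_host:
        return False
    return not parsed.path.lower().endswith((".jpg", ".png", ".pdf", ".zip"))


def build_solr_request(url, title, text):
    """Build a JSON update request for one page (field names are assumed)."""
    doc = {"id": url, "title": title, "content": text}
    body = json.dumps([doc]).encode("utf-8")
    return urllib.request.Request(
        SOLR_UPDATE_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def handle_page(url, title, text, send=urllib.request.urlopen):
    """Page-handling hook: index the page in Solr and return the HTTP status."""
    request = build_solr_request(url, title, text)
    with send(request) as response:
        return response.status
```

In a real crawler, `should_follow` would be called for every discovered link and `handle_page` for every fetched page. Committing on every document, as the URL above does, is simple but slow; batching documents and committing less often is usually preferable.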

answered Oct 04 '22 12:10

Pascal Dimassimo

wassimans

People also ask

2 Answers

nate c

Pascal Dimassimo

Recent Activity

Donate For Us