Have you indexed Nutch crawl results using Elasticsearch before?

2 Answers

I wrote an Elasticsearch plugin that mocks the Solr API. With this plugin and the standard Nutch Solr indexer, you can easily send crawled data into Elasticsearch. The plugin, along with an example of how to use it with Nutch, can be found on GitHub:

https://github.com/mattweber/elasticsearch-mocksolrplugin
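As a rough illustration of the idea (not taken from the plugin's documentation), the Nutch Solr indexer is simply pointed at an Elasticsearch URL instead of a real Solr instance. The Python sketch below assembles such a command. The `_solr` endpoint path, the host, the index and type names, the crawl directory layout, and the helper name `build_solrindex_command` are all assumptions made for the example, and the argument order follows the older Nutch 1.x `solrindex` form as far as I recall, so check the plugin README and the usage output of `bin/nutch solrindex` for your version.

```python
from typing import List


def build_solrindex_command(
    es_host: str,
    index: str,
    doc_type: str,
    crawl_dir: str,
    segment: str,
    nutch_bin: str = "bin/nutch",
) -> List[str]:
    """Build a Nutch `solrindex` command aimed at Elasticsearch (hypothetical helper)."""
    # ASSUMPTION: the plugin exposes a Solr-compatible endpoint at
    # /<index>/<type>/_solr. Verify the real path in the plugin README.
    solr_url = f"http://{es_host}/{index}/{doc_type}/_solr"
    # ASSUMPTION: positional crawldb, linkdb, segment arguments; newer
    # Nutch releases may expect different flags.
    return [
        nutch_bin,
        "solrindex",
        solr_url,
        f"{crawl_dir}/crawldb",
        f"{crawl_dir}/linkdb",
        f"{crawl_dir}/segments/{segment}",
    ]


if __name__ == "__main__":
    # Example values only; replace with your own host, index, and segment.
    cmd = build_solrindex_command(
        "localhost:9200", "nutch", "webpage", "crawl", "20110101000000"
    )
    print(" ".join(cmd))
```

The function only builds the argument list; to actually run it you could pass the list to `subprocess.run` from the Nutch installation directory.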

answered Oct 02 '22 04:10

Matt Weber

I know that Nutch will be adding pluggable backends, and I'm glad to see it. In the meantime, I needed to integrate Elasticsearch with Nutch 1.3. The code is posted at the link below; it piggybacks on the existing Solr indexer code (src/java/org/apache/nutch/indexer/solr).

https://github.com/ctjmorgan/nutch-elasticsearch-indexer

answered Oct 02 '22 04:10

ctjmorgan


Tags: full-text-search, lucene, elasticsearch, web-crawler, nutch

neildf
