
What is the Solr clustering component?

I just went through the Solr wiki page on clustering, but I don't understand what the benefit of using clustering is. Can anyone tell me what clustering actually is and what its use is in indexing and searching?

Please reply

Romi asked Jun 29 '11 11:06


1 Answer

Clustering is a statistical technique for sorting data into groups of items that 'belong together'. In Solr specifically, this means that it will try to group the results of a query and label those groups.

This can give you additional information about the nature of the results returned. Example: if you search for 'Python' on a very broad set of documents, the clustering component might create groups for 'The Python programming language', 'Python the snake', etc.
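To make the 'Python' example concrete, here is a toy sketch of the idea. It is not what Carrot2 actually does (its algorithms are far more sophisticated); it only groups a few made-up result snippets by the most widely shared term and uses that term as the group label. The snippets, the stopword list, and the function names are all assumptions for illustration.

```python
from collections import Counter, defaultdict

# Made-up result snippets for the query "python" (illustrative only).
RESULTS = [
    "Python programming language tutorial for beginners",
    "Learn the Python programming language with examples",
    "Python snake habitat and diet",
    "The ball python is a popular pet snake",
]

# Tiny hand-picked stopword list, just enough for this example.
STOPWORDS = {"the", "a", "is", "and", "for", "with", "of"}


def terms(text, query):
    """Lowercased terms of a snippet, minus stopwords and the query word itself."""
    return {
        word
        for word in text.lower().split()
        if word not in STOPWORDS and word != query.lower()
    }


def cluster_results(snippets, query, min_docs=2):
    """Group snippets under the most widely shared term they contain.

    A term can become a label only if it occurs in at least `min_docs`
    snippets. Ties are broken alphabetically so the output is deterministic.
    Snippets with no shared term end up under "other".
    """
    doc_terms = [terms(snippet, query) for snippet in snippets]
    doc_freq = Counter(term for ts in doc_terms for term in ts)

    clusters = defaultdict(list)
    for snippet, ts in zip(snippets, doc_terms):
        candidates = sorted(
            (term for term in ts if doc_freq[term] >= min_docs),
            key=lambda term: (-doc_freq[term], term),
        )
        label = candidates[0] if candidates else "other"
        clusters[label].append(snippet)
    return dict(clusters)


if __name__ == "__main__":
    for label, docs in cluster_results(RESULTS, "python").items():
        print(label)
        for doc in docs:
            print("   ", doc)
```

With these four snippets, the two programming results land in one group and the two snake results in another. A real engine would also produce nicer labels than a single term.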

Have a look at the Carrot2 demo site (Carrot2 is the clustering engine shipped with Solr):

http://search.carrot2.org/stable/search

Solr's clustering component (Carrot2) clusters the documents using the text fields that Solr returns in the result list. (The fields used are configurable.) It uses the terms in those text fields to build the clusters and label them.
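As a rough sketch of what "configurable" looks like from the client side: in the Carrot2-based component, as far as I know, clustering is switched on per request with `clustering=true`, and the fields to cluster on are given with `carrot.title` and `carrot.snippet`. The base URL, the handler path, and the field names below are assumptions; the handler has to have the clustering component enabled in your solrconfig.xml, and parameter names may differ between Solr versions, so check the documentation for yours.

```python
from urllib.parse import urlencode


def clustering_url(base_url, query, title_field, snippet_field, rows=100):
    """Build a Solr query URL that asks for clustered results.

    base_url is assumed to point at a request handler that has the
    clustering component configured (hypothetical in this example).
    """
    params = {
        "q": query,
        "rows": rows,                     # clustering works on the returned rows
        "wt": "json",
        "clustering": "true",             # switch the component on
        "clustering.results": "true",     # cluster the search results
        "carrot.title": title_field,      # field used as the document title
        "carrot.snippet": snippet_field,  # field used as the document body text
    }
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical host, handler, and field names.
    print(clustering_url(
        "http://localhost:8983/solr/clustering",
        "python",
        title_field="name",
        snippet_field="features",
    ))
```

Asking for more rows gives the engine more text to work with, at the cost of a slower request.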

There is a very interesting presentation on the Carrot2 website:

http://project.carrot2.org/publications/carrot2-dresden-2007.pdf

JanRavn answered Oct 21 '22 12:10