Counting in Solr

Tags:

solr

I am storing the following document in Solr:

  doc {
    id: string; // this is a unique string that looks like an md5 result
    job_id: string; // this also looks like an md5 result -- this is not unique
    doc_id: number; // this is a long number -- this is not unique
    text: string; // this is stored, indexed text -- this is not unique
  }

What I want to do is count the number of distinct doc_id values among the documents whose text contains foo. In SQL, I would issue something like this:

SELECT count(distinct doc_id)
FROM Doc
WHERE text like '%foo%';

Thanks in advance.

user1172468 asked Sep 25 '12 17:09


2 Answers

To make this work with Result Grouping (Field Collapsing), two conditions have to be met:

  • Your text query (the equivalent of "%foo%") has to work as a regular search
  • doc_id has to be a string field; you can keep a copy of it in a separate field and call it doc_id_str

Then you can issue a request like this:

/select/?q=foo&rows=0&group=true&group.field=doc_id_str&group.limit=0&group.ngroups=true&group.format=simple&wt=json

This query works for me; the number of distinct doc_id_str values is reported as ngroups in the response. How well it works for you depends on your index and its size. Please ask if you need more guidance.
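For reference, here is a minimal Python sketch of how the grouping request above could be built and its result read. The base URL, the core name in the usage note, and the function names are assumptions for illustration, not part of the original answer. The response layout assumed here (grouped, then the group field name, then ngroups) is the one Solr returns for group.format=simple, so check it against your own Solr version.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_group_count_url(base_url, term, group_field="doc_id_str"):
    """Build the grouping request; base_url is the (assumed) /select endpoint."""
    params = {
        "q": term,
        "rows": 0,                  # no top-level documents needed
        "group": "true",
        "group.field": group_field,
        "group.limit": 0,           # no documents per group either
        "group.ngroups": "true",    # ask Solr to report the number of groups
        "group.format": "simple",
        "wt": "json",
    }
    return base_url + "?" + urlencode(params)


def extract_distinct_count(response, group_field="doc_id_str"):
    """Read the number of groups (distinct field values) from a parsed response."""
    return response["grouped"][group_field]["ngroups"]


def count_distinct(base_url, term, group_field="doc_id_str"):
    """Send the request to a running Solr instance and return the distinct count."""
    with urlopen(build_group_count_url(base_url, term, group_field)) as resp:
        return extract_distinct_count(json.load(resp), group_field)
```

With a hypothetical core named mycore, the call would look like count_distinct("http://localhost:8983/solr/mycore/select", "foo"). Only the ngroups value is used; matches in the same response is the total number of matching documents, not the distinct count.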

Fuxi answered Nov 14 '22 08:11


An operation equivalent to count(distinct fieldName) is not possible in Solr right now. There are Jira issues related to this problem (SOLR-1814 and SOLR-2242); reading the comments on those issues may help you.

Parvin Gasimzade answered Nov 14 '22 09:11