
Lucene query - "Match exactly one of x, y, z"

I have a Lucene index whose documents have a "type" field. This field can take one of three values: "article", "forum" or "blog". I want the user to be able to restrict a search to these types (there is a checkbox for each document type).

How do I create a Lucene query dependent on which types the user has selected?

A couple of requirements are:

  • If the user doesn't select one of the types, I want no results from that type.
  • The ordering of the results should not be affected by restricting the type field.

For reference, if I were to write this in SQL (for a "blog or forum" search), I'd write:

SELECT * FROM Docs
WHERE [type] in ('blog', 'forum')

thatismatt asked Oct 12 '09 11:10


2 Answers

For reference, should anyone else come across this problem, here is my solution:

IList<string> ALL_TYPES = new[] { "article", "blog", "forum" };
string q = ...; // The user's search string
IList<string> includeTypes = ...; // List of types to include
Query searchQuery = parser.Parse(q);
// Declared as BooleanQuery (not Query) so that Add() is available
BooleanQuery parentQuery = new BooleanQuery();
parentQuery.Add(searchQuery, BooleanClause.Occur.SHOULD);
// Invert the logic, exclude the other types (Except() needs "using System.Linq;")
foreach (var type in ALL_TYPES.Except(includeTypes))
{
    parentQuery.Add(
        new TermQuery(new Term("type", type)),
        BooleanClause.Occur.MUST_NOT
    );
}
searchQuery = parentQuery;

I inverted the logic (i.e. excluded the types the user had not selected) because, if you don't, the ordering of the results is lost. I'm not sure why, though. It is a shame, as it makes the code less clear and less maintainable, but at least it works!

thatismatt answered Oct 23 '22 18:10


Add a constraint that rejects documents of the types that weren't selected. For example, if only "article" was checked, the constraint would be

-(type:forum type:blog)

erickson answered Oct 23 '22 19:10
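
As a rough illustration of the query-syntax approach, the exclusion clause could be generated from the checked types along these lines. This is only a sketch: the class name TypeFilter and the method BuildExclusionClause are hypothetical, and it assumes the three type values from the question and that the resulting string is appended to the user's search text before it goes to the query parser. Prohibited clauses do not contribute to the score, which is probably why excluding types leaves the ordering untouched.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TypeFilter
{
    // The three values the question says the "type" field can take.
    private static readonly string[] AllTypes = { "article", "blog", "forum" };

    // Hypothetical helper: builds a Lucene query-syntax clause that prohibits
    // every type the user did NOT select. Returns an empty string when all
    // types are selected, because nothing needs to be excluded.
    public static string BuildExclusionClause(IEnumerable<string> selectedTypes)
    {
        string[] excluded = AllTypes
            .Except(selectedTypes)       // keeps the order of AllTypes
            .Select(t => "type:" + t)
            .ToArray();

        return excluded.Length == 0
            ? ""
            : "-(" + string.Join(" ", excluded) + ")";
    }
}
```

With only "article" checked this yields -(type:blog type:forum), which is equivalent to the clause above. A purely negative query matches nothing on its own, so the clause has to be combined with the user's search, for example "+(" + q + ") " + clause. If no type is checked, all three types are excluded and the search returns no results, consistent with the requirement in the question.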