I have a Lucene index that contains documents that have a "type" field, this field can be one of three values "article", "forum" or "blog". I want the user to be able to search within these types (there is a checkbox for each document type)
How do I create a Lucene query dependent on which types the user has selected?
A couple of prerequisites are:
For reference if I were to write this in SQL (for a "blog or forum search") I'd write:
SELECT * FROM Docs
WHERE [type] in ('blog', 'forum')
For reference, should anyone else come across this problem, here is my solution:
IList<string> ALL_TYPES = new[] { "article", "blog", "forum" };
string q = ...; // The user's search string
IList<string> includeTypes = ...; // List of types to include
Query searchQuery = parser.Parse(q);
Query parentQuery = new BooleanQuery();
parentQuery.Add(searchQuery, BooleanClause.Occur.SHOULD);
// Invert the logic, exclude the other types
foreach (var type in ALL_TYPES.Except(includeTypes))
{
query.Add(
new TermQuery(new Term("type", type)),
BooleanClause.Occur.MUST_NOT
);
}
searchQuery = parentQuery;
I inverted the logic (i.e. excluded the types the user had not selected), because if you don't the ordering of the results is lost. I'm not sure why though...! It is a shame as it makes the code less clear / maintainable, but at least it works!
Add a constraints to reject documents that weren't selected. For example, if only "article" was checked, the constraint would be
-(type:forum type:blog)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With