
Lucene.Net greater than/less than TermRangeQuery?

I have built a Lucene.Net index of books. Everything works well, but I need to add another way to query the index and I can't figure out how to do it.

Each book has an age range that it is suitable for, expressed by two integer columns: minAge and maxAge.

I am indexing and storing these fields in the following loop:

foreach (var catalogueBook in books)
{
    var book = new Book(catalogueBook.CatalogueBookNo,catalogueBook.IssueId);

    var strTitle = book.FullTitle ?? "";
    var strAuthor = book.Author ?? "";
    // create a Lucene document for this book
    var doc = new Document();

    // add the ID as a stored field, indexed as a single un-analyzed term; not used to query on
    doc.Add(
        new Field(
            "BookId",
            book.CatalogueBookNo.ToString(System.Globalization.CultureInfo.InvariantCulture),
            Field.Store.YES,
            Field.Index.NOT_ANALYZED_NO_NORMS,
            Field.TermVector.NO));

    // add the title and author as stored and tokenized fields, the analyzer processes the content
    doc.Add(
        new Field("FullTitle",
            strTitle.Trim().ToLower(), 
            Field.Store.YES, 
            Field.Index.ANALYZED, 
            Field.TermVector.NO));

    doc.Add(
        new Field("Author",
            strAuthor.Trim().ToLower(),
            Field.Store.YES,
            Field.Index.ANALYZED,
            Field.TermVector.NO));

    doc.Add(
        new Field("IssueId", 
            book.IssueId, 
            Field.Store.YES, 
            Field.Index.NOT_ANALYZED_NO_NORMS, 
            Field.TermVector.NO));

    doc.Add(
        new Field(
            "PublicationId",
            book.PublicationId.Trim().ToLower(),
            Field.Store.YES,
            Field.Index.NOT_ANALYZED_NO_NORMS,
            Field.TermVector.NO));

    doc.Add(
        new Field(
            "MinAge",
            book.MinAge.ToString("0000"),
            Field.Store.YES,
            Field.Index.NOT_ANALYZED_NO_NORMS,
            Field.TermVector.NO));

    doc.Add(
        new Field(
            "MaxAge",
            book.MaxAge.ToString("0000"),
            Field.Store.YES,
            Field.Index.NOT_ANALYZED_NO_NORMS,
            Field.TermVector.NO));

    doc.Add(new NumericField("Price",Field.Store.YES,true).SetDoubleValue(Convert.ToDouble(book.Price)));

    //Now we can loop through categories
    foreach(var bc in book.GetBookCategories())
    {
        doc.Add(
            new Field("CategoryId",
                bc.CategoryId.Trim().ToLower(),
                Field.Store.YES,
                Field.Index.NOT_ANALYZED_NO_NORMS,
                Field.TermVector.NO));
    }

    // add the document to the index
    indexWriter.AddDocument(doc);
}

// merge the index segments once all documents are added, which speeds up searching
indexWriter.Optimize();
}

As you can see, I am zero-padding the MinAge and MaxAge fields, as I thought that would make it easiest to run a TermRangeQuery against them.
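The padding matters because a TermRangeQuery compares terms as strings, not as numbers. Here is a minimal sketch of that ordering in plain C#, with no Lucene involved; the `PaddingDemo` class and its `InTermRange` helper are hypothetical names used only for illustration.

```csharp
using System;

public static class PaddingDemo
{
    // Hypothetical helper: true when value sorts inside [lower, upper]
    // under ordinal string comparison, which is how term ranges are ordered.
    public static bool InTermRange(string value, string lower, string upper)
    {
        return string.CompareOrdinal(lower, value) <= 0
            && string.CompareOrdinal(value, upper) <= 0;
    }

    // Formats an age the same way the indexing loop does.
    public static string Pad(int age)
    {
        return age.ToString("0000");
    }
}
```

Unpadded, `InTermRange("9", "0", "12")` is false because "9" sorts after "12" as a string. With padding, `InTermRange(Pad(9), Pad(0), Pad(12))` is true, since "0009" sorts between "0000" and "0012".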

However, I need to query both the minAge and maxAge columns with an age, to see whether that age falls within the range defined by minAge and maxAge.

The SQL would be:

SELECT *
FROM books
WHERE @age >= minAge AND @age <= maxAge

Unfortunately I cannot see a way to do this. Is this even possible in Lucene.Net?

wingyip asked Sep 28 '12 22:09


2 Answers

You should be able to do this with range queries, if memory serves. This is effectively the inverse of a standard range query: instead of one field with two bounds, you have two fields that each need a one-sided bound. Something like:

+MinAge:[* TO @age] +MaxAge:[@age TO *]
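To see why two one-sided ranges are enough, here is a rough in-memory model of the same filter in plain C# with LINQ. It does not touch Lucene; `BookAge` and `AgeFilter` are hypothetical names, and the list stands in for the index.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for an indexed book; only the age fields matter here.
public sealed class BookAge
{
    public int BookId { get; set; }
    public int MinAge { get; set; }
    public int MaxAge { get; set; }
}

public static class AgeFilter
{
    // Each Where call models one required range clause:
    // the first is open-ended below, the second is open-ended above.
    public static List<int> Suitable(IEnumerable<BookAge> books, int age)
    {
        return books
            .Where(b => b.MinAge <= age)   // lower field: anything up to age
            .Where(b => b.MaxAge >= age)   // upper field: age and anything above
            .Select(b => b.BookId)
            .ToList();
    }
}
```

Because both clauses are required, a book matches only when the age sits between its two bounds, which is the same condition as the SQL in the question.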

Or, if you're constructing the query objects directly, a TermRangeQuery (or better yet, a NumericRangeQuery, which requires the field to be indexed as a NumericField) with a null upper or lower bound works as an open-ended range.

I've used the syntax above before, but support for it seems to be a bit shaky. If it doesn't work, you can always set an adequately low lower bound (0) and an adequately high upper bound (say, 1000), such as:

+MinAge:[0000 TO @age] +MaxAge:[@age TO 1000]

Which should be safe enough, barring any Methuselahs.
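Since the fields were indexed as zero-padded strings, the age substituted for @age has to be padded the same way. Here is a small sketch of building the bounded query text; `AgeQueryText.Build` is a hypothetical helper, the field names are taken from the question's indexing code, and 9999 (the largest four-digit value) is an assumed upper bound rather than the 1000 suggested above.

```csharp
using System;
using System.Globalization;

public static class AgeQueryText
{
    // Hypothetical helper: formats the bounded range query for one age.
    // Assumes ages are indexed zero-padded to four digits in "MinAge" and "MaxAge".
    public static string Build(int age)
    {
        if (age < 0 || age > 9999)
        {
            throw new ArgumentOutOfRangeException("age");
        }

        // Pad exactly as the indexing code does, so string order matches numeric order.
        string padded = age.ToString("0000", CultureInfo.InvariantCulture);

        return string.Format(
            CultureInfo.InvariantCulture,
            "+MinAge:[0000 TO {0}] +MaxAge:[{0} TO 9999]",
            padded);
    }
}
```

For an age of 8, `Build` returns `+MinAge:[0000 TO 0008] +MaxAge:[0008 TO 9999]`, which is the string you would then hand to a query parser.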

femtoRgon answered Oct 23 '22 08:10


I ended up doing this with the help of femtoRgon's answer above:

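// pad the search age the same way the fields were indexed
var paddedAge = searchTerms.Age.ToString("0000");
// MinAge must be <= age
var minAgeQuery = new TermRangeQuery("MinAge", "0000", paddedAge, true, true);
mainQuery.Add(minAgeQuery, BooleanClause.Occur.MUST);
// MaxAge must be >= age
var maxAgeQuery = new TermRangeQuery("MaxAge", paddedAge, "9999", true, true);
mainQuery.Add(maxAgeQuery, BooleanClause.Occur.MUST);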

Wing

wingyip answered Oct 23 '22 09:10