
Lucene.NET 2.9 and BitArray/DocIdSet

I found a great example of getting facet counts for a base query. It stores the BitArray of the base query to improve performance each time a facet gets counted.

        var genreQuery = new TermQuery(new Term("genre", genre));
        var genreQueryFilter = new QueryFilter(genreQuery);
        BitArray genreBitArray = genreQueryFilter.Bits(searcher.GetIndexReader());
        Console.WriteLine("There are " + GetCardinality(genreBitArray) + " document with the genre " + genre);

        // Next perform a regular search and get its BitArray result
        Query searchQuery = MultiFieldQueryParser.Parse(term, new[] {"title", "description"}, new[] {BooleanClause.Occur.SHOULD, BooleanClause.Occur.SHOULD}, new StandardAnalyzer());
        var searchQueryFilter = new QueryFilter(searchQuery);
        BitArray searchBitArray = searchQueryFilter.Bits(searcher.GetIndexReader());
        Console.WriteLine("There are " + GetCardinality(searchBitArray) + " document containing the term " + term);
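The snippet above calls a GetCardinality helper that it does not show, and it stops before the intersection step. As a rough sketch only, assuming GetCardinality simply counts the set bits, the counting and the And step might look like this with plain System.Collections.BitArray. The class name, the CountIntersection helper, and the sample data are made up for illustration; no Lucene.NET types are involved.

```csharp
using System;
using System.Collections;

public static class FacetCountSketch
{
    // Hypothetical stand-in for the GetCardinality helper used in the snippet:
    // counts how many bits are set.
    public static int GetCardinality(BitArray bits)
    {
        int count = 0;
        foreach (bool bit in bits)
        {
            if (bit) count++;
        }
        return count;
    }

    // Counts the documents present in both sets without modifying either input.
    // BitArray.And overwrites the instance it is called on, so it runs on a copy.
    public static int CountIntersection(BitArray baseBits, BitArray facetBits)
    {
        var copy = new BitArray(baseBits);
        return GetCardinality(copy.And(facetBits));
    }

    public static void Main()
    {
        // Made-up data: one bit per document id, for six documents.
        var searchBits = new BitArray(new[] { true, true, false, true, false, true });
        var genreBits = new BitArray(new[] { true, false, false, true, true, false });

        Console.WriteLine(GetCardinality(searchBits));               // prints 4
        Console.WriteLine(CountIntersection(searchBits, genreBits)); // prints 2
    }
}
```

Copying before calling And matters when the base query's bits are cached and reused for several facets, since And would otherwise shrink the cached set on the first facet.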

The only problem is that I am using a newer version of Lucene.NET (2.9), where Filter.Bits is obsolete. We are told to use DocIdSet instead of BitArray.

I cannot figure out how to do the equivalent of bitArray.And(bitArray) with a DocIdSet. I looked in Reflector and found OpenBitSet, which has And operations. I'm not sure whether OpenBitSet is the route to go; I'm just mentioning it.

Thanks in advance!

Paul Knopf asked Jun 01 '10 00:06



1 Answer

Figured it out.

        // Copy the base result set into a bit set; 25000 is its capacity and must exceed the highest doc id.
        var productsDISI = new OpenBitSetDISI(productResults.Iterator(), 25000);
        var termQuery = new TermQuery(new Term("Spec" + expectedFacet.SpecificationId, expectedFacet.SpecificationOptionId.ToString()));
        var termQueryFilter = new QueryWrapperFilter(termQuery);
        var termIterator = termQueryFilter.GetDocIdSet(productReader).Iterator();
        // Intersect in place (this modifies productsDISI), then count the documents that remain.
        productsDISI.InPlaceAnd(termIterator);
        var total = productsDISI.Cardinality();

This turns out to be much faster too.

Paul Knopf answered Nov 02 '22 17:11
