I'm trying to update my code from Lucene 3.4 to 4.1. I figured out all the changes except one. I have code which needs to iterate over all term values for one field. In Lucene 3.x there was an IndexReader#terms() method providing a TermEnum, which I could iterate over. This seems to have changed in Lucene 4.1, and even after several hours of searching the documentation I am not able to figure out how to do it now. Can someone please point me in the right direction?
Thanks.
Please follow the Lucene 4 migration guide:
How you obtain the enums has changed. The primary entry point is the Fields class. If you know your reader is a single-segment reader, do this:
Fields fields = reader.fields(); if (fields != null) { ... }
If the reader might be multi-segment, you must do this:
Fields fields = MultiFields.getFields(reader); if (fields != null) { ... }
The fields may be null (e.g. if the reader has no fields).
Note that the MultiFields approach entails a performance hit on MultiReaders, as it must merge terms/docs/positions on the fly. It's generally better to instead get the sequential readers (use oal.util.ReaderUtil) and then step through those readers yourself, if you can (this is how Lucene drives searches).
If you pass a SegmentReader to MultiFields.getFields it will simply return reader.fields(), so there is no performance hit in that case.
Once you have a non-null Fields you can do this:
Terms terms = fields.terms("field"); if (terms != null) { ... }
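The terms may be null (e.g. if the field does not exist).
Once you have a non-null terms you can get an enum like this:
TermsEnum termsEnum = terms.iterator(null);
(In Lucene 4.x, iterator takes an existing TermsEnum to reuse; pass null to get a fresh one.)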
The returned TermsEnum will not be null.
You can then .next() through the TermsEnum. Each call returns the next term as a BytesRef, or null once the terms are exhausted.
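To make the performance note about MultiFields more concrete, here is a rough, Lucene-free sketch of what "merging terms on the fly" across segments involves. Everything in it (MergedTermsSketch, Cursor, MergedTerms, allTerms) is a hypothetical stand-in I made up for illustration, not Lucene's actual implementation. It assumes each segment exposes its terms as a sorted list of strings, and it only borrows the TermsEnum convention that next() returns null when there is nothing left.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

public class MergedTermsSketch {

    /** Hypothetical per-segment cursor: the current term plus the rest of that segment's sorted terms. */
    private static final class Cursor {
        String current;
        final Iterator<String> rest;

        Cursor(Iterator<String> it) {
            this.rest = it;
            this.current = it.next();
        }
    }

    /**
     * Hypothetical merged view over several sorted per-segment term lists.
     * Like a TermsEnum, next() returns null once all terms are consumed.
     */
    static final class MergedTerms {
        // Orders the segment cursors by the term each one is currently positioned on.
        private final PriorityQueue<Cursor> queue =
                new PriorityQueue<>((a, b) -> a.current.compareTo(b.current));

        MergedTerms(List<List<String>> segments) {
            for (List<String> segment : segments) {
                if (!segment.isEmpty()) {
                    queue.add(new Cursor(segment.iterator()));
                }
            }
        }

        String next() {
            if (queue.isEmpty()) {
                return null;
            }
            String term = queue.peek().current;
            // Advance every segment currently positioned on this term, so a term
            // that occurs in several segments is reported only once.
            while (!queue.isEmpty() && queue.peek().current.equals(term)) {
                Cursor cursor = queue.poll();
                if (cursor.rest.hasNext()) {
                    cursor.current = cursor.rest.next();
                    queue.add(cursor);
                }
            }
            return term;
        }
    }

    /** Drains the merged view using the same null-terminated loop you would use with a TermsEnum. */
    public static List<String> allTerms(List<List<String>> segments) {
        MergedTerms merged = new MergedTerms(segments);
        List<String> out = new ArrayList<>();
        String term;
        while ((term = merged.next()) != null) {
            out.add(term);
        }
        return out;
    }

    public static void main(String[] args) {
        List<List<String>> segments = Arrays.asList(
                Arrays.asList("apple", "cherry"),
                Arrays.asList("banana", "cherry", "date"));
        System.out.println(allTerms(segments)); // prints [apple, banana, cherry, date]
    }
}
```

Every call to next() has to go through the priority queue, which is roughly the extra work you avoid when you step through the per-segment readers yourself. The while loop in allTerms has the same shape as the loop over a real TermsEnum, with BytesRef in place of String.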