How to get the matching spans of a Span Term Query in Lucene 5?

Question

In Lucene, the advised way to get the words around a term is to use Span Queries. There is a good walkthrough at http://lucidworks.com/blog/accessing-words-around-a-positional-match-in-lucene/

The spans are supposed to be accessed using the getSpans() method.

SpanTermQuery fleeceQ = new SpanTermQuery(new Term("content", "fleece"));
Spans spans = fleeceQ.getSpans(searcher.getIndexReader());

In Lucene 4 the API changed and the getSpans() method became more complex. In the latest Lucene release (5.3.0) the method was removed from the query class altogether (apparently moved to the SpanWeight class).

So, what is the current way of accessing the spans matched by a span term query?

Apurv · Accepted Answer

The way to do it would be as follows.

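// Wrap the (possibly multi-segment) reader so it can be treated as a single leaf.
LeafReader pseudoAtomicReader = SlowCompositeReaderWrapper.wrap(reader);
Term term = new Term("field", "fox");
SpanTermQuery spanTermQuery = new SpanTermQuery(term);
// "is" is assumed to be an IndexSearcher created over pseudoAtomicReader; false = scores are not needed.
SpanWeight spanWeight = spanTermQuery.createWeight(is, false);
// Postings is an enum nested in SpanWeight. getSpans() may return null if the term is absent.
Spans spans = spanWeight.getSpans(pseudoAtomicReader.getContext(), SpanWeight.Postings.POSITIONS);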

Iterating with spans.next() is also no longer supported in Lucene 5.3. Instead, advance through the matching documents with nextDoc() and through the matches inside each document with nextStartPosition():

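if (spans != null) { // getSpans() returns null when there is nothing to match
  int nxtDoc;
  while ((nxtDoc = spans.nextDoc()) != Spans.NO_MORE_DOCS) {
    System.out.println(spans);
    System.out.println("doc_id=" + nxtDoc);
    Document doc = reader.document(nxtDoc);
    System.out.println(doc.getField("field"));
    // A document can contain several matches, so walk every position.
    while (spans.nextStartPosition() != Spans.NO_MORE_POSITIONS) {
      // endPosition() is exclusive: one past the last matched token.
      System.out.println(spans.startPosition() + "-" + spans.endPosition());
    }
  }
}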
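
Once you have the start and end positions, getting the words around the match comes down to slicing the document's token list. Below is a minimal sketch in plain Java, not part of the Lucene API: the class name SpanContext and the method window are made up for illustration, and it assumes you already have the document's tokens in position order (for example by re-analyzing the stored field). Whitespace splitting here only stands in for a real analyzer, so positions may not line up if your analyzer removes or injects tokens.

```java
import java.util.Arrays;
import java.util.List;

public class SpanContext {
    /**
     * Returns the tokens from (start - before) up to (end + after), clamped to the list bounds.
     * start is inclusive and end is exclusive, like Spans.startPosition()/endPosition().
     */
    static List<String> window(List<String> tokens, int start, int end, int before, int after) {
        int from = Math.max(0, start - before);
        int to = Math.min(tokens.size(), end + after);
        return tokens.subList(from, to);
    }

    public static void main(String[] args) {
        // Assumption: whitespace splitting stands in for the real analyzer's token stream.
        List<String> tokens = Arrays.asList("the quick brown fox jumps over the lazy dog".split(" "));
        // Hypothetical match: "fox" at token position 3, so the span is [3, 4).
        System.out.println(window(tokens, 3, 4, 2, 2)); // prints [quick, brown, fox, jumps, over]
    }
}
```

In the loop above you would call window(tokens, spans.startPosition(), spans.endPosition(), 2, 2) for each match, with tokens built from the current document.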


Tags:

lucene

Julián Solórzano
