ShingleFilterFactory affects size of highlighted section in Solr

Question

Adding ShingleFilterFactory to a field type in Solr (at index time) changes the behavior of highlighting at query time.

Sample Text: "in a ship a dragon was in a box"

Without ShingleFilterFactory both "in" tokens will be highlighted separately.

<em>in</em> a ship a dragon was <em>in</em> a box

With it, the whole span between the two matches is returned as a single highlight.

<em>in a ship a dragon was in</em>

Why does the use of ShingleFilterFactory affect the highlighting?

EDIT:

Adding schema info as requested:

<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.ShingleFilterFactory" maxShingleSize="2" outputUnigrams="true"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

Using text_general, which contains the shingle filter, results in unusually large highlight fields as described above.

alexf · Accepted Answer

Maybe you can use this highlighter:

https://issues.apache.org/jira/browse/LUCENE-1522

The problem you are pointing out is known, and some patches are available:

https://issues.apache.org/jira/browse/LUCENE-1489

Edit: The second link is the same one that Bereng posted.
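
To make the effect concrete, here is a small Python sketch of what seems to be going on. It is not Lucene code. It assumes a simplified model of the behavior discussed in LUCENE-1489: the highlighter merges tokens whose character offsets overlap into one group, then marks the group from its first matching token to its last matching token. The function names (unigrams, with_shingles, highlight) are made up for this illustration, and stop words and synonyms are ignored.

```python
import re


def unigrams(text):
    """Simplified tokenizer: lowercase words as (term, start, end) tuples."""
    return [(m.group().lower(), m.start(), m.end())
            for m in re.finditer(r"\w+", text)]


def with_shingles(tokens):
    """Roughly maxShingleSize=2 with outputUnigrams=true: each unigram is
    followed by the two-word shingle that starts at the same offset."""
    out = []
    for i, (term, start, end) in enumerate(tokens):
        out.append((term, start, end))
        if i + 1 < len(tokens):
            next_term, _, next_end = tokens[i + 1]
            out.append((term + " " + next_term, start, next_end))
    return out


def highlight(text, tokens, query_terms):
    """Assumed grouping model: tokens with overlapping offsets share a group;
    a group is marked from its first to its last matching token."""
    spans = []        # (start, end) regions to wrap in <em>
    group_end = -1    # end offset of the current group
    match = None      # [start, end] covering the matches in the current group
    for term, start, end in tokens:
        if start >= group_end:
            # Token does not overlap the current group: close it, start anew.
            if match:
                spans.append(tuple(match))
            match = None
            group_end = end
        else:
            group_end = max(group_end, end)
        if term in query_terms:
            if match is None:
                match = [start, end]
            else:
                match[0] = min(match[0], start)
                match[1] = max(match[1], end)
    if match:
        spans.append(tuple(match))

    parts, pos = [], 0
    for start, end in spans:
        parts.append(text[pos:start])
        parts.append("<em>" + text[start:end] + "</em>")
        pos = end
    parts.append(text[pos:])
    return "".join(parts)


text = "in a ship a dragon was in a box"
tokens = unigrams(text)

print(highlight(text, tokens, {"in"}))
# <em>in</em> a ship a dragon was <em>in</em> a box

print(highlight(text, with_shingles(tokens), {"in"}))
# <em>in a ship a dragon was in</em> a box
```

Without shingles no two tokens overlap, so every token is its own group and each "in" is marked separately. With shingles, every two-word token overlaps the next unigram, so the whole sentence chains into a single group and the highlight runs from the first "in" to the last one.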

Bereng · Answer

This won't help much, but it will shed some light:

https://issues.apache.org/jira/browse/LUCENE-1489

Tags: solr, highlighting
