A) I need to perform a phrase search. The search results contain the exact phrase matches, but looking at the highlighted parts I see that the phrase is tokenized, i.e. this is what I get when I search for the phrase "Day 1":
<arr name="post">
  <str><em>Day</em> <em>1</em>   We have begun a new adventure! An early morning (4:30 a.m.) has found me meeting with</str>
</arr>
This is what I want to receive as a result:
    <arr name="post">
  <str><em>Day 1</em>   We have begun a new adventure! An early morning (4:30 a.m.) has found me meeting with</str>
</arr>
The query I'm running is this (Admin console):
q = day 1
fq = post:"day 1" OR title:"day 1"
hl = true
hl.fl = title,post
select?q=day+1&fq=post%3A%22day+1%22+OR+title%3A%22day+1%22&wt=xml&indent=true&hl=true&hl.fl=title%2Cpost&hl.simple.pre=%3Cem%3E&hl.simple.post=%3C%2Fem%3E
These are my fields:
     <field name="post" type="text_general" indexed="true" stored="true" required="true" multiValued="false" />
      <field name="post" type="text_general" indexed="true" stored="true" required="true" multiValued="false" />
This is the Solr schema section for my field type text_general:
    <fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
    <!-- in this example, we will only use synonyms at query time
    <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
    -->
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.GreekStemFilterFactory"/>
    <filter class="solr.GreekLowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
B) In the highlight section I also see more disturbing results, i.e. highlighting of single fragments instead of the whole phrase as expected:
...where you get to see all of Athens ...  <em>Day</em> 2 -  Carmens
I don't want to see this result in the highlighted section (I only need to see both words, "Day 1"). Any ideas?
I'm reading the Solr highlighting documentation but, really, there is not even one example!
The parameter that needed to be added was hl.q, which basically means "I want this phrase to be highlighted", together with hl.usePhraseHighlighter=true and hl.useFastVectorHighlighter=true.
So adding &hl.q="Day+1"&hl.usePhraseHighlighter=true&hl.useFastVectorHighlighter=true to my original query worked.
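For anyone building this request from code rather than the Admin console, here is a minimal sketch in Python that assembles the same parameters with the standard library. The base URL (host, port, and the core name `mycore`) and the helper name `build_phrase_highlight_url` are assumptions for illustration only; the parameter names and values are the ones from the query above.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: host, port and core name are assumptions.
SOLR_SELECT = "http://localhost:8983/solr/mycore/select"


def build_phrase_highlight_url(phrase, fields=("post", "title"), base=SOLR_SELECT):
    """Build a select URL that highlights `phrase` as a whole phrase."""
    quoted = '"' + phrase + '"'
    params = {
        "q": phrase,
        # Restrict results to documents containing the exact phrase.
        "fq": " OR ".join(f"{field}:{quoted}" for field in fields),
        "wt": "xml",
        "indent": "true",
        "hl": "true",
        "hl.fl": ",".join(fields),
        # Highlight the quoted phrase instead of the individual terms of q.
        "hl.q": quoted,
        "hl.usePhraseHighlighter": "true",
        "hl.useFastVectorHighlighter": "true",
    }
    # urlencode escapes quotes, colons and commas and turns spaces into '+'.
    return base + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_phrase_highlight_url("day 1"))
```

Passing `fields=("post",)` produces the narrower filter on the post field only.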
For B) I changed fq = post:"day 1" OR title:"day 1" to fq = post:"day 1". I know that the latter is less than what I need, but nevertheless it works.
FastVectorHighlighter configuration that was used:
   <field name="post" type="text_general" indexed="true" stored="true" required="true" multiValued="false"  termVectors="true" termPositions="true" termOffsets="true"/>