
Solr - highlight query phrase

Is it possible to highlight a whole query phrase as one unit? For example, when I search for "United States" I want to get:

<em>United States</em>

and not:

<em>United</em> <em>States</em>

I've searched extensively for an answer and tried every combination of the hl.mergeContiguous, hl.usePhrasesHighlighter and hl.highlightMultiTerm parameters, but I still cannot make it work.

My query is:

http://localhost:8983/solandra/idxPosts.proj350_139/select?q=post_text:"Janusz Palikot"&hl=true&hl.fl=post_text&hl.mergeContiguous=true&hl.usePhrasesHighlighter=true&hl.highlightMultiTerm=true

The response is:

...
<arr name="post_text"><str>Tag: <em>janusz</em> <em>palikot</em> - Sowiniec: "Sowiniec"</str></arr>
...

My "post_text" field is:

<field name="post_text" type="text" indexed="true" stored="true" termVectors="true" termPositions="true" termOffsets="true" required="true" />

My "text" type is:

<fieldType name="text" class="solr.TextField">
    <analyzer type="index">
        <tokenizer class="solr.StandardTokenizerFactory" />
        <filter class="solr.TrimFilterFactory" />
        <filter class="solr.LowerCaseFilterFactory" />
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_pl.txt" />
        <filter class="solr.ReversedWildcardFilterFactory" />
    </analyzer>
    <analyzer type="query">
        <tokenizer class="solr.StandardTokenizerFactory" />
        <filter class="solr.TrimFilterFactory" />
        <filter class="solr.LowerCaseFilterFactory" />
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_pl.txt" />
    </analyzer>
</fieldType>

I also tried the FastVectorHighlighter with hl.useFastVectorHighlighter=true, but got this error:

Problem accessing /solandra/idxPosts.proj350_139/select. Reason:

    -6

java.lang.ArrayIndexOutOfBoundsException: -6
    at lucandra.TermFreqVector.getOffsets(TermFreqVector.java:224)
    at org.apache.lucene.search.vectorhighlight.FieldTermStack.<init>(FieldTermStack.java:100)
    at org.apache.lucene.search.vectorhighlight.FastVectorHighlighter.getFieldFragList(FastVectorHighlighter.java:175)
    at org.apache.lucene.search.vectorhighlight.FastVectorHighlighter.getBestFragments(FastVectorHighlighter.java:166)
    at org.apache.solr.highlight.DefaultSolrHighlighter.doHighlightingByFastVectorHighlighter(DefaultSolrHighlighter.java:509)
    at org.apache.solr.highlight.DefaultSolrHighlighter.doHighlighting(DefaultSolrHighlighter.java:376)
    at org.apache.solr.handler.component.HighlightComponent.process(HighlightComponent.java:116)
    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:194)
    ...

Can you help me, please?

asked Nov 29 '11 09:11 by Tom_LK




1 Answer

For phrase highlighting, there is a Jira issue still waiting to make it into the Solr code.
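One detail worth checking independently of that issue: the query in the question spells the parameter hl.usePhrasesHighlighter, while the name documented by Solr is hl.usePhraseHighlighter (singular "Phrase"). Unrecognized request parameters are generally ignored rather than rejected, so the misspelled one probably had no effect. Below is a minimal Python sketch that builds the same request with the documented spelling. The host, core path, field and phrase are copied from the question; the helper name build_highlight_url is made up for illustration. This is not claimed to produce a single <em> span around the whole phrase, since the highlighter may still wrap each matched term separately.

```python
from urllib.parse import urlencode

# Host and core path copied from the question; adjust for your own setup.
BASE_URL = "http://localhost:8983/solandra/idxPosts.proj350_139/select"


def build_highlight_url(field, phrase, base_url=BASE_URL):
    """Build a select URL that requests phrase-aware highlighting.

    build_highlight_url is a hypothetical helper, not part of any Solr client.
    """
    params = {
        "q": f'{field}:"{phrase}"',         # quoted phrase query on one field
        "hl": "true",                       # turn highlighting on
        "hl.fl": field,                     # field to highlight
        "hl.usePhraseHighlighter": "true",  # documented spelling (no extra "s")
        "hl.highlightMultiTerm": "true",
        "hl.mergeContiguous": "true",
    }
    # urlencode percent-encodes the quotes and colon and turns the space into "+".
    return f"{base_url}?{urlencode(params)}"


url = build_highlight_url("post_text", "Janusz Palikot")
print(url)
```

Building the URL with urlencode also avoids sending raw quotes and spaces in the query string, which some HTTP clients reject.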

answered Sep 16 '22 17:09 by Jayendra