Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Elasticsearch "starts with" first word in phrases

I try to implement an A - Z navigation for my content with Elasticsearch. What I need, is displaying all results which begins with e.g. a,b,c,... etc.

I've tried:

"query": {
        "match_phrase_prefix" : {
        "title" : {
            "query" : "a"
        }
      }
    }

The query mentioned above also display results, where within the string a word begins with a. Example:

"title": "Apfelpfannkuchen",

"title": "Affogato",

"title": "Kalbsschnitzel an Aceto Balsamico",

I want to display only phrase where the FIRST word begins with a.

Here the mapping I use:

$params = array(
            'index' => 'my_index',
            'body' => array(
                'settings' => array(
                    'number_of_shards' => 1,
                    'index' => array(
                        'analysis' => array(
                            'filter' => array(
                                'nGram_filter' => array(
                                    'type' => 'nGram',
                                    'min_gram' => 2,
                                    'max_gram' => 20,
                                    'token_chars' => array('letter', 'digit', 'punctuation', 'symbol')
                                )
                            ),
                            'analyzer' => array(
                                'nGram_analyzer' => array(
                                    'type' => 'custom',
                                    'tokenizer' => 'whitespace',
                                    'filter' => array('lowercase', 'asciifolding', 'nGram_filter')
                                ),
                                'whitespace_analyzer' => array(
                                    'type' => 'custom',
                                    'tokenizer' => 'whitespace',
                                    'filter' => array('lowercase', 'asciifolding')
                                ),
                                'analyzer_startswith' => array(
                                    'tokenizer' => 'keyword',
                                    'filter' => 'lowercase'
                                )
                            )
                        )
                    )
                ),
                'mappings' => array(
                    'tags' => array(
                        '_all' => array(
                            'type' => 'string',
                            'index_analyzer' => 'nGram_analyzer',
                            'search_analyzer' => 'whitespace_analyzer'
                        ),
                        'properties' => array()

                    ),
                    'posts' => array(
                        '_all' => array(
                            'index_analyzer' => 'nGram_analyzer',
                            'search_analyzer' => 'whitespace_analyzer'
                        ),
                        'properties' => array(
                            'title' => array(
                                'type' => 'string',
                                'index_analyzer' => 'analyzer_startswith',
                                'search_analyzer' => 'analyzer_startswith'
                            )
                        )
                    )
                )
            )
        );
like image 456
alin Avatar asked Apr 20 '15 07:04

alin


People also ask

What is Elasticsearch full text search?

Overview Full-text search queries and performs linguistic searches against documents. It includes single or multiple words or phrases and returns documents that match search condition. ElasticSearch is a search engine based on Apache Lucene, a free and open-source information retrieval software library.

Should I take the Elasticsearch “starts-with” advice?

We no longer recommend taking its advice. “Starts-with” functionality is common in search applications. Either you want to return results as the user types (ala Google Instant) or you simply want to search partial phrases and return any matches. This can be accomplished in Elasticsearch several ways.

How does Elasticsearch sort its matching results?

By default, ElasticSearch sorts matching results by their relevance score, that is, by how well each document matches the query. Note, that the score of the second result is small relative to the first hit, indicating lower relevance. 6. Fuzzy Search Fuzzy matching treats two words that are “fuzzily” similar as if they were the same word.

How to do proximity search in Elasticsearch?

In order to achieve proximity search, we simply need to define the search window, so how far we allow the terms to be. This is called slop in Elasticsearch/Lucene terminology. The change to the previous code is really minimal, for example for a slop/window of 3 terms: The result of the query:


1 Answers

If you are using default mapping then it will not work for you.

You need to use keyword tokenizer and lowercase filter in mapping.

Mapping Will be :

{
    "settings": {
        "index": {
            "analysis": {
                "analyzer": {
                    "analyzer_startswith": {
                        "tokenizer": "keyword",
                        "filter": "lowercase"
                    }
                }
            }
        }
    },
    "mappings": {
        "test_index": {
            "properties": {
                "title": {
                    "search_analyzer": "analyzer_startswith",
                    "index_analyzer": "analyzer_startswith",
                    "type": "string"
                }
            }
        }
    }
}

Search query on test_index :

{
    "query": {
        "match_phrase_prefix": {
            "title": {
                "query": "a"
            }
        }
    }
}

It will return all post title starting with a

like image 151
Roopendra Avatar answered Sep 22 '22 20:09

Roopendra