When I use Elasticsearch, I have to index the data first. In this process I have been blindly using "snowball" and "keyword"
in the analyzer column. What is the main use of an analyzer (I know it is a booster), how does it help me in Elasticsearch, and what does the keyword "snowball" mean?
'data.description': {'analyzer': 'snowball', 'type': 'string'}, 'data.title': {'analyzer': 'snowball', 'type': 'string'}
The Snowball analyzer converts words into language and code set specific stem words. It is similar to the Standard analyzer, except that it also converts words to stem word tokens.
In a nutshell an analyzer is used to tell elasticsearch how the text should be indexed and searched. And what you're looking into is the Analyze API, which is a very nice tool to understand how analyzers work. The text is provided to this API and is not related to the index.
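As a minimal sketch of what a call to the Analyze API might look like, the snippet below only builds the JSON body for `POST /_analyze` and prints it; nothing is sent to a cluster. The helper name `build_analyze_request` is hypothetical, and the `snowball` analyzer name is taken from the mapping in the question, so it may not be available on every Elasticsearch version.

```python
import json


def build_analyze_request(analyzer, text):
    """Return the JSON body for an _analyze call (the request is not sent)."""
    # "analyzer" and "text" are the two fields the Analyze API reads here.
    return json.dumps({"analyzer": analyzer, "text": text})


body = build_analyze_request("snowball", "Indexing a string")
print("POST /_analyze")
print(body)
```

Sending this body to a running node with any HTTP client should return the list of tokens the analyzer produces, without touching the documents already in the index.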
Any Elasticsearch stemming algorithms that use dictionaries are actually lemmatization-based approaches. There are some open source lemmatization plugins for Elasticsearch, e.g. LemmaGen, and a list of other open source lemmatizers.
In Elasticsearch, stemming is handled by stemmer token filters. These token filters can be categorized based on how they stem words: Algorithmic stemmers, which stem words based on a set of rules.
An analyzer is the process that extracts indexable terms from the text given for indexing.
For example:
Take the text "i am a dinosaur from modern age". When this is analyzed with a stop word analyzer whose stop word list contains "i", "am", "a" and "from" (the exact list depends on the configuration; Elasticsearch's default English list is shorter), only the keywords dinosaur, modern and age are stored in the index. This means that if you search for "am", the search won't find that document, even though the word is present in the text you indexed.
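The stop word example above can be sketched in plain Python. This is only an illustration of the idea, not how Elasticsearch implements it, and the `STOP_WORDS` set is an assumption chosen to match the example rather than Elasticsearch's built-in list.

```python
import re

# Assumed stop word list for this illustration only; the list that
# Elasticsearch ships with is different.
STOP_WORDS = {"i", "am", "a", "from"}


def analyze(text, stop_words=STOP_WORDS):
    """Lowercase the text, split it into word tokens, and drop stop words."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in stop_words]


indexed_terms = analyze("i am a dinosaur from modern age")
print(indexed_terms)          # ['dinosaur', 'modern', 'age']
print("am" in indexed_terms)  # False: a search for "am" finds nothing
```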
Similarly, the snowball analyzer combines the standard tokenizer with the standard, lowercase, stop and snowball filters - https://www.elastic.co/guide/en/elasticsearch/reference/2.4/analysis-snowball-analyzer.html
The snowball filter is used to stem words based on a specific stemmer. A stemmer uses some rules to determine the proper stem of a word. That means different stemmers may return different results.
For example, the words “indexing”, “indexable”, “indexes”, “indexation”, etc. will be stemmed to “index”. This is particularly useful for retrieving a document with the title “Make my string indexable” when you search for “Indexing a string”. (c)
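To make the title/query example concrete, here is a rough sketch with a toy suffix stripper. It is not the Snowball algorithm: the `SUFFIXES` tuple, the `STOP_WORDS` set, and the function names are all hypothetical and chosen only so that the words from the example reduce to the same stem.

```python
import re

# Hypothetical, deliberately tiny suffix list; the real Snowball stemmers
# apply many more rules and conditions.
SUFFIXES = ("ation", "able", "ing", "es", "s")
STOP_WORDS = {"a", "my"}  # assumed for this illustration


def toy_stem(word):
    """Strip the first matching suffix if a usable stem remains."""
    for suffix in SUFFIXES:
        stem = word[: -len(suffix)]
        # Keep the stem only if it has 3+ letters and contains a vowel,
        # so "string" is not cut down to "str".
        if word.endswith(suffix) and len(stem) >= 3 and re.search(r"[aeiou]", stem):
            return stem
    return word


def analyze(text):
    """Lowercase, tokenize, drop stop words, then stem each token."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return {toy_stem(t) for t in tokens if t not in STOP_WORDS}


title_terms = analyze("Make my string indexable")
query_terms = analyze("Indexing a string")
print(sorted(title_terms & query_terms))  # ['index', 'string']
```

Because both the indexed title and the query go through the same analysis, they share the term "index" and the document can match even though neither side contains the other's exact word.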
To configure this filter see https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-snowball-tokenfilter.html
P.S. http://snowball.tartarus.org/ | http://snowballstem.org/