
What does the keyword "snowball" mean in Elasticsearch?

When I use Elasticsearch, I have to index my data first. In that process I am blindly putting "snowball" or "keyword" in the analyzer column. What is the main use of an analyzer (I know it is a booster), and how does it help me in Elasticsearch? And what does the keyword "snowball" mean?

'data.description': {'analyzer': 'snowball', 'type': 'string'},
'data.title': {'analyzer': 'snowball', 'type': 'string'}
Paarudas asked Feb 23 '12 12:02





2 Answers

An analyzer is the process that extracts indexable terms from the text submitted for indexing.

For example:

Take the text "i am a dinosaur from modern age". When it is analyzed with a stop word analyzer (assuming its stop list contains "i", "am", "a", and "from"), only the terms "dinosaur", "modern", and "age" are stored in the index. So if you search for "am", the search will not match that document, even though the word is present in the text you indexed.
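
To make the stop word example concrete, here is a minimal sketch in plain Python. It does not call Elasticsearch. The `STOP_WORDS` set and the `analyze` helper are hypothetical names used only for illustration, and the stop list is deliberately chosen to match the example sentence.

```python
import re

# Hypothetical stop list chosen to match the example sentence;
# a real stop analyzer uses whatever list it is configured with.
STOP_WORDS = {"i", "am", "a", "from"}

def analyze(text, stop_words=STOP_WORDS):
    """Lowercase the text, split it into runs of letters, and drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in stop_words]

indexed_terms = analyze("i am a dinosaur from modern age")
print(indexed_terms)  # ['dinosaur', 'modern', 'age']

# A search for "am" finds nothing, because "am" was never indexed.
print("am" in indexed_terms)  # False
```

The point is that a search can only match terms that survived analysis at index time.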

Similarly, the snowball analyzer combines the standard tokenizer with the standard, lowercase, and stop filters, plus the snowball filter that does the stemming: https://www.elastic.co/guide/en/elasticsearch/reference/2.4/analysis-snowball-analyzer.html
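
As a rough sketch, that kind of chain can also be spelled out as a custom analyzer in the index settings, written here as a Python dict in the style of the mapping in the question. The names `my_snowball` and `english_snowball` are placeholders I made up. The `standard` token filter listed in the 2.4 docs is left out on the assumption that you may be on a newer release that no longer ships it. Treat this as an illustration, not a drop-in configuration.

```python
import json

# Placeholder names: "my_snowball" (analyzer) and "english_snowball" (filter).
index_settings = {
    "settings": {
        "analysis": {
            "filter": {
                # Snowball token filter; "language" selects the stemmer.
                "english_snowball": {"type": "snowball", "language": "English"}
            },
            "analyzer": {
                "my_snowball": {
                    "type": "custom",
                    "tokenizer": "standard",
                    # Filters run in order: lowercase, drop stop words, stem.
                    "filter": ["lowercase", "stop", "english_snowball"],
                }
            },
        }
    }
}

# Print the JSON body that would be sent when creating the index.
print(json.dumps(index_settings, indent=2))
```

A field would then refer to it with `'analyzer': 'my_snowball'` in its mapping.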

Vineeth Mohan answered Dec 01 '22 20:12



The snowball filter stems words using a language-specific stemmer. A stemmer applies a set of rules to reduce a word to its stem, so different stemmers may return different results.

For example, the words “indexing”, “indexable”, “indexes”, “indexation”, etc. will all be stemmed to “index”. This is particularly useful for retrieving a document titled “Make my string indexable” when you search for “Indexing a string”. (c)
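
Here is a toy sketch of why stemming makes that search match. It is NOT the Snowball algorithm, just a naive suffix stripper in plain Python. The names `SUFFIXES`, `toy_stem`, and `terms` are hypothetical, and the suffix list is picked to cover only the words in this example.

```python
import re

# Toy suffix list for illustration only; this is NOT the Snowball algorithm.
SUFFIXES = ("ation", "able", "ing", "es", "s")

def toy_stem(word):
    """Strip the first matching suffix, keeping at least four letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 4:
            return word[: -len(suffix)]
    return word

def terms(text):
    """Lowercase, split into runs of letters, and stem every token."""
    return {toy_stem(t) for t in re.findall(r"[a-z]+", text.lower())}

stems = [toy_stem(w) for w in ["indexing", "indexable", "indexes", "indexation"]]
print(stems)  # ['index', 'index', 'index', 'index']

# The title and the query share stemmed terms, so the document matches.
title_terms = terms("Make my string indexable")
query_terms = terms("Indexing a string")
shared = title_terms & query_terms
print(sorted(shared))  # ['index', 'string']
```

Without stemming, the only term the title and the query would share is "string". A real stemmer uses far more careful rules than this one.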

To configure this filter, see https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-snowball-tokenfilter.html

P.S. http://snowball.tartarus.org/ | http://snowballstem.org/

Evgeniy Tkachenko answered Dec 01 '22 20:12
