
Get default stop word list in Elasticsearch

I am trying to find out what the predefined stop word lists for Elasticsearch are, but I have found no documented read API for this.

So, I want to find the word lists for these predefined variables (_arabic_, _armenian_, _basque_, _brazilian_, _bulgarian_, _catalan_, _czech_, _danish_, _dutch_, _english_, _finnish_, _french_, _galician_, _german_, _greek_, _hindi_, _hungarian_, _indonesian_, _irish_, _italian_, _latvian_, _norwegian_, _persian_, _portuguese_, _romanian_, _russian_, _sorani_, _spanish_, _swedish_, _thai_, _turkish_).

I found the English stop word list in the documentation, but I want to check whether it is the one my server really uses, and also check the stop word lists for the other languages.

Paul Weber asked Nov 21 '16 11:11



1 Answer

The stop words used by the English Analyzer are the same as the ones defined in the Standard Analyzer, namely the ones you found in the documentation.

The stop word files for all other languages can be found in the Lucene repository in the analysis/common/src/resources/org/apache/lucene/analysis folder.
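
To check what a running server actually does, rather than reading the files, one option is to push candidate words through the `_analyze` endpoint with an inline `stop` filter and see which ones come back. The sketch below is only an illustration under assumptions: it assumes your Elasticsearch version accepts inline filter definitions in `_analyze`, that the node is reachable at a URL you supply, and that no authentication is needed. The function names are made up for this example. It does not dump the full list; it only tells you which of the words you supply are dropped.

```python
import json
import urllib.request


def build_analyze_body(language, words):
    """Build an _analyze request body that runs `words` through a stop filter.

    `language` is the bare name (e.g. "french"); it is wrapped in underscores
    to form the predefined list name (e.g. "_french_").
    """
    return {
        "tokenizer": "standard",
        "filter": [
            "lowercase",
            {"type": "stop", "stopwords": "_%s_" % language},
        ],
        "text": " ".join(words),
    }


def removed_words(words, response):
    """Return the candidate words that are missing from the analyzed tokens."""
    kept = {t["token"] for t in response.get("tokens", [])}
    return [w for w in words if w.lower() not in kept]


def check_stop_words(base_url, language, words):
    """POST to <base_url>/_analyze and report which words the server dropped.

    Assumption: the server needs no authentication and accepts JSON bodies.
    """
    body = json.dumps(build_analyze_body(language, words)).encode("utf-8")
    req = urllib.request.Request(
        base_url.rstrip("/") + "/_analyze",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return removed_words(words, json.load(resp))
```

For example, `check_stop_words("http://localhost:9200", "english", ["the", "quick", "and", "fox"])` would return whichever of those four words your server treats as English stop words. Candidates that the standard tokenizer splits or alters (for instance words containing apostrophes) may be reported incorrectly, so compare against the Lucene files for anything that looks odd.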

Val answered Sep 27 '22 22:09
