
How to set up a tokenizer in Elasticsearch

I have an embedded Elasticsearch instance using the elasticsearch-jetty project, and I need to set it up to use a tokenizer other than the default. I want to use the keyword tokenizer.

I can't figure out for the life of me how to do this through the config files. Can anyone point me to a way to do it?

As an aside, is it possible to adjust the index while it's up and running by doing a POST to the index? I'd really like to understand how to use this, thank you.

EDIT/update: I'm having trouble running curl -XPUT or -XPOST against localhost:9200 to adjust settings, following some of the examples and forum posts I found while searching. I get 'No handler for uri [] and method [PUT]' (or [POST]).

EDIT 2: Update: a PUT to an index works, but I get an "Index already exists" error. I know it exists; I want to update it.

cdietschrun asked Feb 26 '13 00:02


1 Answer

You can define mappings in the config files, but in most cases it is easier and more flexible to configure them through the API. For example, this command creates the index testindex with a keyword/lowercase analyzer and applies it to the title field of the test mapping type:

$ curl -XPUT localhost:9200/testindex/ -d '
{
  "settings":{
     "index":{
        "analysis":{
           "analyzer":{
              "analyzer_keyword":{
                 "tokenizer":"keyword",
                 "filter":"lowercase"
              }
           }
        }
     }
  },
  "mappings":{
     "test":{
        "properties":{
           "title":{
              "analyzer":"analyzer_keyword",
              "type":"string"
           }
        }
     }
  }
}'
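
The command above creates a new index, which is why it fails with "Index already exists" when the index is already there. For an existing index, a minimal sketch looks like the following, assuming the node listens on localhost:9200. Analysis settings are static, so the index has to be closed before they are changed and reopened afterwards. The function name update_analyzer and the ES_URL variable are made up for illustration and are not part of Elasticsearch. Recent Elasticsearch versions also expect a -H 'Content-Type: application/json' header on requests with a body.

```shell
#!/bin/sh
# Hypothetical helper: adds the keyword/lowercase analyzer to an index
# that already exists. ES_URL and update_analyzer are illustrative names.
ES_URL="${ES_URL:-localhost:9200}"

update_analyzer() {
  index="$1"

  # Analysis settings cannot be changed on an open index, so close it first.
  curl -XPOST "$ES_URL/$index/_close"

  # Add the analyzer definition through the index settings endpoint.
  curl -XPUT "$ES_URL/$index/_settings" -d '
  {
    "analysis":{
      "analyzer":{
        "analyzer_keyword":{
          "tokenizer":"keyword",
          "filter":"lowercase"
        }
      }
    }
  }'

  # Reopen the index so it can serve requests again.
  curl -XPOST "$ES_URL/$index/_open"
}
```

Calling update_analyzer testindex would then send the three requests in order. Note that the requests go to the index path, not to the bare localhost:9200 root, which has no handler for PUT or POST. The analyzer of a field that is already mapped cannot be changed this way, so existing documents would need to be reindexed into a field or index that uses the new analyzer.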
Zach answered Oct 12 '22 23:10