Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Change default mapping of string to "not analyzed" in Elasticsearch

In my system, the insertion of data is always done through csv files via logstash. I never pre-define the mapping. But whenever I input a string it is always taken to be analyzed, as a result an entry like hello I am Sinha is split into hello,I,am,Sinha. Is there anyway I could change the default/dynamic mapping of elasticsearch so that all strings, irrespective of index, irrespective of type are taken to be not analyzed? Or is there a way of setting it in the .conf file? Say my conf file looks like

input {         file {           path => "/home/sagnik/work/logstash-1.4.2/bin/promosms_dec15.csv"           type => "promosms_dec15"           start_position => "beginning"           sincedb_path => "/dev/null"       } } filter {      csv {         columns => ["Comm_Plan","Queue_Booking","Order_Reference","Multi_Ordertype"]         separator => ","     }       ruby {           code => "event['Generation_Date'] = Date.parse(event['Generation_Date']);"     }  } output {       elasticsearch {          action => "index"         host => "localhost"         index => "promosms-%{+dd.MM.YYYY}"         workers => 1     } } 

I want all the strings to be not analyzed and I don't mind it being the default setting for all future data to be inserted into elasticsearch either

like image 812
Sagnik Sinha Avatar asked Dec 15 '14 11:12

Sagnik Sinha


People also ask

How do I remove a mapping field in Elasticsearch?

You cannot delete the status field from this mapping. If you really need to get rid of this field, you'll have to create another mapping without status field and reindex your data.


2 Answers

Just create a template. run

curl -XPUT localhost:9200/_template/template_1 -d '{     "template": "*",     "settings": {         "index.refresh_interval": "5s"     },     "mappings": {         "_default_": {             "_all": {                 "enabled": true             },             "dynamic_templates": [                 {                     "string_fields": {                         "match": "*",                         "match_mapping_type": "string",                         "mapping": {                             "index": "not_analyzed",                             "omit_norms": true,                             "type": "string"                         }                     }                 }             ],             "properties": {                 "@version": {                     "type": "string",                     "index": "not_analyzed"                 },                 "geoip": {                     "type": "object",                     "dynamic": true,                     "path": "full",                     "properties": {                         "location": {                             "type": "geo_point"                         }                     }                 }             }         }     } }' 
like image 109
Sagnik Sinha Avatar answered Oct 20 '22 10:10

Sagnik Sinha


You can query the .raw version of your field. This was added in Logstash 1.3.1:

The logstash index template we provide adds a “.raw” field to every field you index. These “.raw” fields are set by logstash as “not_analyzed” so that no analysis or tokenization takes place – our original value is used as-is!

So if your field is called foo, you'd query foo.raw to return the not_analyzed (not split on delimiters) version.

like image 30
Banjer Avatar answered Oct 20 '22 11:10

Banjer