
Elasticsearch keyword and lowercase and aggregation

I previously stored some fields with the mapping type "keyword", but they are case-sensitive.

To solve this, it is possible to use an analyzer, such as

{
  "index": {
    "analysis": {
      "analyzer": {
        "keyword_lowercase": {
          "tokenizer": "keyword",
          "filter": [
            "lowercase"
          ]
        }
      }
    }
  }
}

with the mapping

{
  "properties": {
    "field": {
      "type": "string",
      "analyzer": "keyword_lowercase"
    }
  }
}

But then a terms aggregation on the field fails with:

Caused by: java.lang.IllegalArgumentException: Fielddata is disabled on text fields by default. Set fielddata=true on [a] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory.

The aggregation works when the mapping uses type=keyword, but type=keyword does not seem to allow an analyzer.

How do I index it as a lowercase keyword but still make it possible to use aggregation without setting fielddata=true?

J2B asked Apr 19 '17 10:04


1 Answer

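If you're using ES 5.2 or above, you can use a normalizer on keyword fields. A normalizer lowercases the whole value as a single token before it is indexed, and because the field is still of type keyword it keeps its doc values, so terms aggregations work without fielddata=true. Note that the buckets will contain the lowercased values. Declare your index settings and mappings like this: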

PUT index
{
  "settings": {
    "analysis": {
      "normalizer": {
        "keyword_lowercase": {
          "type": "custom",
          "filter": ["lowercase"]
        }
      }
    }
  },
  "mappings": {
    "type": {
      "properties": {
        "field": {
          "type": "keyword",
          "normalizer": "keyword_lowercase"
        }
      }
    }
  }
}
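
To check the result, a terms aggregation on the field should now return one bucket per lowercased value. The following is a minimal sketch in Python and is not part of the original answer: it builds the aggregation request body you could send to the index's _search endpoint, and it mimics locally what the lowercase normalizer does to incoming values. The aggregation name by_value, the helper names, and the sample values are assumptions for illustration only.

```python
import json
from collections import Counter


def terms_agg_body(field, size=10):
    """Build a terms aggregation request body for the _search endpoint."""
    return {
        "size": 0,  # skip the hits, return only the aggregation buckets
        "aggs": {
            # "by_value" is a hypothetical aggregation name
            "by_value": {"terms": {"field": field, "size": size}}
        },
    }


def simulate_buckets(values):
    """Local stand-in for a keyword field with a lowercase normalizer.

    Each value stays a single token and is lowercased before counting.
    This uses str.lower(), which may differ from Elasticsearch's
    lowercase filter for some Unicode input.
    """
    return Counter(value.lower() for value in values)


if __name__ == "__main__":
    print(json.dumps(terms_agg_body("field"), indent=2))
    print(simulate_buckets(["Foo Bar", "foo bar", "FOO BAR", "baz"]))
```

The simulation only illustrates the expected bucketing: values that differ only by case collapse into one bucket, and multi-word values are not split into separate terms.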
Val answered Oct 23 '22 15:10