
Should I use the keyword or long / integer datatype for a document's personId in Elasticsearch?

I have a document with personId (which is an int in DB).

I am not sure whether I should choose keyword or long when creating the document in Elasticsearch.

In terms of space and performance, what are the benefits and disadvantages of each? (I can only find comparisons of text vs. keyword, not keyword vs. long.)

Xin asked Nov 28 '17 22:11



1 Answer

The fact that some data is numeric does not mean it should always be mapped as a numeric field. The way that Elasticsearch indexes numbers optimizes for range queries while keyword fields are better at term queries. Typically, fields storing identifiers such as an ISBN or any number identifying a record from another database are rarely used in range queries or aggregations. This is why they might benefit from being mapped as keyword rather than as integer or long.

Quoted from https://www.elastic.co/guide/en/elasticsearch/reference/current/tune-for-search-speed.html#map-ids-as-keyword
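
As a rough illustration of the advice quoted above, the sketch below builds the JSON bodies for a mapping that stores personId as keyword, plus an exact-match term query against it. It only constructs plain dictionaries and does not talk to a cluster. The helper names and the as_long sub-field are hypothetical, and the typeless "mappings" / "properties" layout assumes Elasticsearch 7 or later (older versions nest properties under a mapping type). The optional long sub-field is one way to keep range queries available if you are unsure which type you will need.

```python
import json


def build_person_mapping(also_numeric: bool = False) -> dict:
    """Return an index mapping body in which personId is a keyword field."""
    person_id = {"type": "keyword"}
    if also_numeric:
        # Multi-field: additionally index the value as long for occasional
        # range queries. The sub-field name "as_long" is a made-up example.
        person_id["fields"] = {"as_long": {"type": "long"}}
    return {"mappings": {"properties": {"personId": person_id}}}


def build_person_lookup(person_id: int) -> dict:
    """Return a term query body that matches a single personId exactly."""
    # Keyword values are strings, so the database integer is converted here.
    return {"query": {"term": {"personId": str(person_id)}}}


if __name__ == "__main__":
    print(json.dumps(build_person_mapping(), indent=2))
    print(json.dumps(build_person_lookup(42)))
```

The first body would be sent when creating the index and the second as a search request. With also_numeric=True, a range query could target personId.as_long instead of personId.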

Shubhojit Saha answered Sep 18 '22 18:09