
What format does Elasticsearch use for its ids?

Elasticsearch ids that are generated by the system look like this:

"_id": "AU9HiR3lEVul15o3bNYl"

What format is that? Also, does anyone know of a library to generate ids like that?

Donny V. asked Sep 02 '15 19:09



2 Answers

Before v1.4.0, Elasticsearch used UUID-based ids. These ids were the Base64-encoded form of a version 4 UUID as defined by RFC 4122. The ids were encoded with URL-safe Base64 (see section 4 of RFC 3548), and the two trailing "=" signs were removed, because the Base64 encoding of 16 bytes always ends with two "=" padding characters. That leaves 22 characters.
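As a rough illustration of the pre-1.4.0 scheme described above, here is a minimal Python sketch. The function name legacy_style_id is made up for this example, and this is not Elasticsearch's own code; it only follows the recipe in the answer (random version 4 UUID, URL-safe Base64, padding stripped).

```python
import base64
import uuid


def legacy_style_id():
    """Return a 22-character id built the way the answer describes.

    Hypothetical helper, not part of any Elasticsearch client.
    """
    raw = uuid.uuid4().bytes                 # 16 random bytes (version 4 UUID)
    encoded = base64.urlsafe_b64encode(raw)  # 24 characters, ending in "=="
    return encoded.rstrip(b"=").decode("ascii")


print(legacy_style_id())
```

The result should always be 22 characters long, which is why the 20-character id in the question cannot be of this older kind.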

Unfortunately, completely random ids were less than ideal from a performance perspective. So, starting with version 1.4.0, Elasticsearch switched to time-based ids. The new id format is essentially a variant of flake ids, except that it uses 6 (not 8) bytes for the timestamp and 3 (not 2) bytes for the sequence number.
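For anyone who wants to generate ids of a similar shape, the following Python sketch packs a 6-byte millisecond timestamp, a 6-byte node identifier and a 3-byte sequence number into 15 bytes, which Base64-encode to 20 characters. The byte order, the 6-byte node part (standard flake ids use a worker id there) and the random stand-in for it are assumptions made for illustration; flake_like_id is a hypothetical name, and this is not the actual Elasticsearch generator.

```python
import base64
import os
import threading
import time

_lock = threading.Lock()
# Assumption: start the 3-byte sequence at a random value.
_sequence = int.from_bytes(os.urandom(3), "big")
# Assumption: random bytes stand in for a real node identifier.
_node = os.urandom(6)


def flake_like_id(now_ms=None):
    """Return a 20-character, time-prefixed id (illustrative sketch only)."""
    global _sequence
    with _lock:
        _sequence = (_sequence + 1) & 0xFFFFFF  # wrap around at 3 bytes
        seq = _sequence
    if now_ms is None:
        now_ms = time.time_ns() // 1_000_000    # current time in milliseconds
    timestamp = (now_ms & 0xFFFFFFFFFFFF).to_bytes(6, "big")
    raw = timestamp + _node + seq.to_bytes(3, "big")  # 6 + 6 + 3 = 15 bytes
    # 15 bytes is a multiple of 3, so the encoding carries no "=" padding.
    return base64.urlsafe_b64encode(raw).decode("ascii")


print(flake_like_id())
```

Because the timestamp comes first, ids generated close together in time share a common prefix, which is the property that makes them friendlier to the index than fully random ids.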

The id in the question, AU9HiR3lEVul15o3bNYl, looks like a time-based id that was generated somewhere in the middle of Aug 2015.
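The date can be checked by decoding the id. This sketch assumes that the first 6 bytes of the decoded id hold a big-endian millisecond Unix timestamp. That assumption is consistent with the id in the question, but other Elasticsearch versions may lay the bytes out differently. The function name decode_timestamp is hypothetical.

```python
import base64
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def decode_timestamp(es_id):
    """Read the leading 6 bytes of a time-based id as milliseconds since the epoch."""
    padded = es_id + "=" * (-len(es_id) % 4)   # restore padding if it was stripped
    raw = base64.urlsafe_b64decode(padded)
    millis = int.from_bytes(raw[:6], "big")    # assumed layout: timestamp first
    return EPOCH + timedelta(milliseconds=millis)


print(decode_timestamp("AU9HiR3lEVul15o3bNYl"))
```

For the id in the question this gives a UTC date of 19 Aug 2015, which matches the estimate above.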

imotov answered Oct 12 '22 01:10


Autogenerated IDs are 22-character, URL-safe, Base64-encoded universally unique identifiers (UUIDs), although it looks like your ID is 20 characters long.

For .NET, I think there is some more information in the question "What is the string length of a GUID?"; it looks like Guid.NewGuid will work.

CJW answered Oct 12 '22 03:10