I can see that the docs say we can set a TTL on a document, but not on an index/indices. I also wanted to know whether setting a TTL has any performance impact.
The _ttl mappings have been removed. As a replacement for _ttl mappings, we recommend using ILM to create time-based indices.
Index per document type: from Elasticsearch version 6.0, an index by default doesn't allow multiple types. The better option is to always have one document type per index.
The _ttl-like approach is deprecated now (because of the performance impact of iterating over the documents again and again), and Elastic introduced index lifecycle management (ILM).
So what you would do instead is create an index dynamically, for instance one per day, with a date-specific name pattern such as my-app-log-yyyy-mm-dd, and use an ILM policy to handle deletion of indices that fall outside the wanted time frame.
Besides that, Elastic gives you an API for managing such policies (e.g. POST or GET requests), so you can automate this within your application to avoid manual work and keep it all nice and consistent.
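As a rough sketch of what such a policy could look like, the snippet below builds the request body for an ILM policy whose only phase deletes an index once it reaches a given age. The policy name my-app-logs-policy, the 7-day retention, and the helper name build_delete_policy are assumptions made up for illustration; they do not come from the answer above. The resulting body is what would be sent with PUT _ilm/policy/<policy-name>.

```python
import json


def build_delete_policy(retention_days: int) -> dict:
    """Build an ILM policy body that deletes an index after retention_days.

    Hypothetical helper: the name and signature are illustrative only.
    """
    if retention_days < 1:
        raise ValueError("retention_days must be at least 1")
    return {
        "policy": {
            "phases": {
                # The delete phase starts once the index is this old.
                "delete": {
                    "min_age": f"{retention_days}d",
                    "actions": {"delete": {}},
                }
            }
        }
    }


if __name__ == "__main__":
    # Assumed policy name; print the request instead of sending it.
    print("PUT _ilm/policy/my-app-logs-policy")
    print(json.dumps(build_delete_policy(7), indent=2))
```

The policy only takes effect for indices that reference it, typically through the index.lifecycle.name setting in an index template matching the date-based name pattern.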
The indices themselves are usually easily managed by loggers. Logback, for instance, allows you to create dynamic indices when you define the index name in the configuration in the following way:
<index>my-app-logs-%date{yyyy-MM-dd}</index>
_ttl is enabled per index, but the expiration works per document.
If you want your indices to "expire", delete them. That is much simpler and more performant.
And yes, _ttl has a performance impact.
The Elasticsearch "way" of dealing with "expired" data is to create time-based indices. That is, for each day or each week you create an index and index everything belonging to that day/week into it. You decide how many days you want to keep around and stick to that number.
Let's say that you want to keep the data for 7 days. On the 8th day you create the new index, as usual, then you delete the oldest index, the one from 7 days before. At all times you'll have 7 indices in your cluster.
The ttl mechanism checks every indices.ttl.interval (60 seconds by default) for expired documents, creates bulk requests out of them, and deletes them. This means unnecessary requests coming to the cluster.
Instead, deleting an index is very easy and quick.
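The retention rule described above can be sketched in a few lines. This is only an illustration under assumed conventions: daily indices named my-app-logs-YYYY-MM-DD, a 7-day window, and hypothetical helper names index_name and expired_indices. It only decides which index names are out of the window; it does not talk to a cluster.

```python
from datetime import date, timedelta
from typing import List

PREFIX = "my-app-logs-"  # assumed naming pattern: my-app-logs-YYYY-MM-DD


def index_name(day: date) -> str:
    """Return the daily index name for the given day."""
    return f"{PREFIX}{day.isoformat()}"


def expired_indices(existing: List[str], today: date, keep_days: int = 7) -> List[str]:
    """Return the names of daily indices that fall outside the retention window.

    The window covers today plus the (keep_days - 1) days before it, so
    keep_days daily indices remain at any time.
    """
    oldest_kept = today - timedelta(days=keep_days - 1)
    expired = []
    for name in existing:
        if not name.startswith(PREFIX):
            continue  # not one of our daily indices
        try:
            day = date.fromisoformat(name[len(PREFIX):])
        except ValueError:
            continue  # suffix is not a valid date, leave the index alone
        if day < oldest_kept:
            expired.append(name)
    return sorted(expired)
```

Each returned name could then be removed with a single DELETE /<index-name> request, which drops the whole index at once instead of issuing bulk deletes for individual documents.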
Take a look at this and how to easily manage time based indices with Curator.