Is there a way to get the date and time that an elastic search document was written?
I am running es queries via spark and would prefer NOT to look through all documents that I have already processed. Instead I would like read the only documents that were ingested between the last time the program ran and now.
What is the best most efficient way to do this?
I have looked at;
Elasticsearch version 5.6
By default, Elasticsearch periodically refreshes indices every second, but only on indices that have received one search request or more in the last 30 seconds. You can change this default interval using the index. refresh_interval setting.
Upserts are "Update or Insert" operations. This means an upsert attempts to run your update script, but if the document does not exist (or the field you are trying to update doesn't exist), default values are inserted instead.
I posted the question on the elasticsearch discussion board and it appears using the ingest pipeline is the best option.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With