
TRIM_HORIZON vs LATEST

I can't find anything in the official AWS Kinesis documentation that explicitly describes how TRIM_HORIZON relates to the checkpoint, or how LATEST relates to the checkpoint.

Can you confirm my theory:

  • TRIM_HORIZON - If the application name is new, I will read all the records available in the stream. Otherwise (the application name was already used), I will read from my last checkpoint.

  • LATEST - If the application name is new, I will read only the records added to the stream after I subscribed to it. Otherwise (the application name was already used), I will read from my last checkpoint.

  • The difference between TRIM_HORIZON and LATEST matters only when the application name is new.

Ida Amit asked Apr 09 '18 08:04




2 Answers

AT_TIMESTAMP

-- from a specific timestamp

TRIM_HORIZON

-- all the available messages in the Kinesis stream, from the beginning (same as earliest in Kafka)

LATEST

-- from the latest messages, i.e. the message that has just arrived in Kinesis/Kafka and all incoming messages from that time onwards
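
To make the three starting positions concrete, here is a minimal in-memory sketch. It does not call AWS; the `Record` class, the `start_index` helper and the sample shard are all hypothetical names invented for illustration. It assumes the shard is a list of untrimmed records ordered oldest first, and that AT_TIMESTAMP starts at the first record at or after the given time.

```python
from dataclasses import dataclass


@dataclass
class Record:
    # Hypothetical stand-in for a Kinesis record; not an AWS SDK type.
    sequence_number: int
    timestamp: float
    data: str


def start_index(records, iterator_type, timestamp=None):
    """Return the index in `records` (oldest first) where reading would begin."""
    if iterator_type == "TRIM_HORIZON":
        # Oldest record still retained in the shard.
        return 0
    if iterator_type == "LATEST":
        # Just past the newest record: only future records are read.
        return len(records)
    if iterator_type == "AT_TIMESTAMP":
        if timestamp is None:
            raise ValueError("AT_TIMESTAMP needs a timestamp")
        # First record at or after the requested time.
        for i, record in enumerate(records):
            if record.timestamp >= timestamp:
                return i
        return len(records)
    raise ValueError(f"unsupported iterator type: {iterator_type}")


shard = [
    Record(1, 100.0, "a"),
    Record(2, 200.0, "b"),
    Record(3, 300.0, "c"),
]

for kind, ts in [("TRIM_HORIZON", None), ("LATEST", None), ("AT_TIMESTAMP", 150.0)]:
    pending = [r.data for r in shard[start_index(shard, kind, ts):]]
    print(kind, pending)
```

With LATEST the list of already-stored records to read is empty, which is why a consumer started this way only sees data that arrives afterwards.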

Suresh answered Oct 07 '22 23:10


From GetShardIterator documentation (which lines up with my experience using Kinesis):

In the request, you can specify the shard iterator type AT_TIMESTAMP to read records from an arbitrary point in time, TRIM_HORIZON to cause ShardIterator to point to the last untrimmed record in the shard in the system (the oldest data record in the shard), or LATEST so that you always read the most recent data in the shard.

Basically, the difference is whether you want to start from the oldest available record (TRIM_HORIZON) or from "right now" (LATEST, which skips any data written between your last checkpoint and now).
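
On the checkpoint part of the question: GetShardIterator itself knows nothing about checkpoints; those are kept by the consumer (for example by the Kinesis Client Library, under the application name). The sketch below shows one way such a consumer might choose its starting point, assuming it stores one sequence number per shard and falls back to the configured initial position only when no checkpoint exists. The `resolve_start` function and the `checkpoints` dictionary are hypothetical, not part of any AWS library.

```python
def resolve_start(checkpoints, shard_id, initial_position):
    """Pick the iterator type (and sequence number, if any) for one shard.

    `checkpoints` maps shard id -> last processed sequence number and stands
    in for whatever store the consumer keeps under its application name.
    """
    sequence_number = checkpoints.get(shard_id)
    if sequence_number is not None:
        # A checkpoint exists: resume right after it and ignore the
        # configured initial position.
        return ("AFTER_SEQUENCE_NUMBER", sequence_number)
    # No checkpoint yet (new application name): use the initial position.
    return (initial_position, None)


# New application name: the initial position decides where reading starts.
print(resolve_start({}, "shardId-000000000000", "TRIM_HORIZON"))
print(resolve_start({}, "shardId-000000000000", "LATEST"))

# Existing application name: both settings resume from the checkpoint.
saved = {"shardId-000000000000": "4958"}
print(resolve_start(saved, "shardId-000000000000", "TRIM_HORIZON"))
print(resolve_start(saved, "shardId-000000000000", "LATEST"))
```

Under these assumptions, TRIM_HORIZON and LATEST differ only while no checkpoint has been saved, which matches the theory in the question.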

Krease answered Oct 07 '22 22:10