 

Amazon Kinesis KPL vs AWS SDK pros and cons

The scenario: I will be writing large volumes of data (terabytes per day) to a Kinesis stream, and I want to know the best way to achieve high write throughput. I am considering the two options below for producer clients.

Option 1: the Kinesis Producer Library (KPL).

or

Option 2: the AWS SDK (API).

I know the KPL is an abstraction on top of the AWS SDK, so it basically boils down to (KPL with AWS SDK) or just the AWS SDK. From what I have researched, it seems the AWS SDK does not provide the ability to aggregate multiple records into a single put, whereas the KPL does support this aggregation (please correct me if this is wrong).

Both PutRecords (from the Kinesis Data Streams API) and the KPL (using aggregation) provide high write throughput. The question is which of the two options is better, and why. In a nutshell, I want to know which will be faster at writing data to a Kinesis stream; once the data is in the stream, I do not care how it is read. I am also interested in how the retry mechanism and asynchronous write performance differ between the two.

yin yang asked Jan 31 '26 01:01



1 Answer

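Yes, there are two main differences between the SDK and the KPL. First, with the SDK each PutRecord or PutRecords call goes out as soon as you make it, so there is no added buffering delay, and any batching beyond packing up to 500 records into one PutRecords call is up to you. The KPL batches for you in two ways: aggregation (packing multiple user records into a single Kinesis record) and collection (sending multiple Kinesis records in one PutRecords call). This comes at the cost of some latency, bounded by RecordMaxBufferedTime, in exchange for better efficiency and throughput. Second, the KPL is a Java library, so your producer has to run on the JVM, whereas the SDK is available in many languages (for example Boto3 for Python), and the same API can be called from the AWS CLI. Please refer to the API reference.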
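To make the SDK side of this concrete, here is a minimal sketch of client-side batching for PutRecords in Python. It only groups records into request-sized chunks; it does not call AWS. The helper name `batch_records` is hypothetical, and the default limits (500 records and 5 MiB per request, counting data plus partition key) are assumptions that should be checked against the current PutRecords API reference.

```python
from typing import Any, Dict, Iterable, Iterator, List

# Assumed PutRecords limits; verify against the current API reference.
MAX_RECORDS_PER_CALL = 500
MAX_BYTES_PER_CALL = 5 * 1024 * 1024


def batch_records(
    records: Iterable[Dict[str, Any]],
    max_records: int = MAX_RECORDS_PER_CALL,
    max_bytes: int = MAX_BYTES_PER_CALL,
) -> Iterator[List[Dict[str, Any]]]:
    """Group {'Data': bytes, 'PartitionKey': str} entries into PutRecords-sized batches."""
    batch: List[Dict[str, Any]] = []
    batch_bytes = 0
    for rec in records:
        # Count the payload plus the UTF-8 encoded partition key.
        size = len(rec["Data"]) + len(rec["PartitionKey"].encode("utf-8"))
        # Close the current batch if adding this record would exceed a limit.
        if batch and (len(batch) >= max_records or batch_bytes + size > max_bytes):
            yield batch
            batch, batch_bytes = [], 0
        batch.append(rec)
        batch_bytes += size
    if batch:
        yield batch
```

Each yielded batch can then be passed as the `Records` argument of Boto3's `put_records(StreamName=..., Records=batch)`. Note that this is collection only: unlike KPL aggregation, every entry still counts as one Kinesis record against the per-shard record limit.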

If you can run Java and a little latency is not an issue, go for the KPL. However, if you want each write to be sent immediately, or you need a language other than Java, go for the API and choose whatever language you prefer.

In conclusion, the SDK provides the basic operations, while the KPL is built on top of them and gives you batching, aggregation, and retries out of the box. The KPL's higher latency comes from buffering records before sending them (up to RecordMaxBufferedTime), not from the extra functionality as such.
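On the retry question: with the plain SDK, a PutRecords call can succeed as a whole while individual records fail, so you have to inspect the response and resend the failures yourself. The sketch below shows one way to do that, assuming a Boto3-style client whose `put_records` response contains `FailedRecordCount` and a `Records` list in request order, with `ErrorCode` present on failed entries. The function name `put_with_retry`, the attempt count, and the backoff values are illustrative choices, not anything the KPL or SDK prescribes.

```python
import time
from typing import Any, Callable, Dict, List


def put_with_retry(
    client: Any,
    stream_name: str,
    records: List[Dict[str, Any]],
    max_attempts: int = 5,
    base_delay: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> List[Dict[str, Any]]:
    """Send records with PutRecords, resending only the entries that failed.

    Returns the records that still failed after max_attempts (empty on success).
    """
    pending = list(records)
    for attempt in range(max_attempts):
        response = client.put_records(StreamName=stream_name, Records=pending)
        if response.get("FailedRecordCount", 0) == 0:
            return []
        # Response entries line up with request entries; failures carry ErrorCode.
        pending = [
            rec
            for rec, result in zip(pending, response["Records"])
            if "ErrorCode" in result
        ]
        if attempt < max_attempts - 1:
            # Exponential backoff before the next attempt.
            sleep(base_delay * (2 ** attempt))
    return pending
```

With Boto3 this would be called as `put_with_retry(boto3.client("kinesis"), "my-stream", batch)`, where `"my-stream"` is a placeholder stream name. Resending failed records can change their order relative to records that succeeded, which matters only if consumers depend on ordering.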

Hamza E. Khan answered Feb 01 '26 15:02



