
Read and write transactions in Amazon Kinesis

I'm new to Kinesis, so this might seem like a very basic question, but I have not been able to find a clear answer to what the actual difference is between a read and write transaction in a Kinesis stream.

Relevant parts from Amazon Kinesis Limits:

  • GetShardIterator can provide up to 5 transactions per second per open shard.
  • GetRecords can retrieve 10 MB of data.
  • Each shard can support up to 5 transactions per second for reads, up to a maximum total data read rate of 2 MB per second.
  • Each shard can support up to 1024 records per second for writes, up to a maximum total data write rate of 1 MB per second (including partition keys). This write limit applies to operations such as PutRecord and PutRecords.

It clearly mentions 5 reads and 1024 writes per second per shard. Why are reads so much more expensive than writes, or is there a crucial Kinesis concept here I haven't grasped?

KennethJ asked Jun 10 '15 21:06




1 Answer

Kinesis enables you to ingest granular data into a stream and read batches of records to process the information. The two limits count different things: the write limit counts individual records, while a read transaction is a single GetRecords call that can return many records at once. So the volume of megabytes you can read per second is much more important than the number of read transactions you get per shard. For example, you might have a busy website generating thousands of views per minute and an EMR cluster to process your access logs. In this scenario, you will have many more write events than read events. The same is valid for clickstreams, financial transactions, social media feeds, IT logs, and location-tracking events.
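To make the comparison concrete, here is a rough back-of-the-envelope sketch in Python. It uses only the per-shard numbers quoted in the question (which may differ from the current AWS documentation), and the function names are made up for illustration. It also assumes every record has the same size and ignores any per-call cap on the number of records GetRecords returns.

```python
# Per-shard limits as quoted in the question (assumed, not re-checked against AWS docs).
READ_TRANSACTIONS_PER_SEC = 5
READ_BYTES_PER_SEC = 2 * 1024 * 1024
WRITE_RECORDS_PER_SEC = 1024
WRITE_BYTES_PER_SEC = 1 * 1024 * 1024


def max_records_written_per_sec(record_size_bytes: int) -> int:
    """Writes are capped by both the record count and the bytes per second."""
    return min(WRITE_RECORDS_PER_SEC, WRITE_BYTES_PER_SEC // record_size_bytes)


def max_records_read_per_sec(record_size_bytes: int) -> int:
    """Reads are capped by bytes per second; each call returns a batch of records."""
    return READ_BYTES_PER_SEC // record_size_bytes


def records_per_read_call(record_size_bytes: int) -> int:
    """Average batch size if the consumer spreads reads over all 5 transactions."""
    return max_records_read_per_sec(record_size_bytes) // READ_TRANSACTIONS_PER_SEC


if __name__ == "__main__":
    size = 1024  # hypothetical 1 KB records
    print("records written per second:", max_records_written_per_sec(size))
    print("records read per second:   ", max_records_read_per_sec(size))
    print("records per read call:     ", records_per_read_call(size))
```

With 1 KB records, a shard accepts up to 1024 records per second but can hand back about 2048 records per second, roughly 409 per GetRecords call. Under these assumptions reads are not more expensive than writes; they are just batched.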

Uilton Dutra answered Oct 07 '22 11:10