
Amazon Kinesis vs Amazon Managed Streaming for Apache Kafka (MSK) - (Connect from on-prem)

I'm evaluating AWS Kinesis vs Amazon Managed Streaming for Apache Kafka (MSK). Our requirement is to send JSON messages to AWS from an on-prem system (developed in C++). We then need to persist those messages in a relational database such as PostgreSQL and, at the same time, stream the same data to other microservices (Java) hosted in AWS.

I have the following queries:

i) How can I access (connect and send messages to) AWS Kinesis from my on-premise system? Is there a C++ API that supports this? (There is a Java client API, but our on-prem system is written in C++.)

ii) How can I access (connect and send messages to) AWS MSK from my on-premise system?

iii) Is it possible to integrate MSK with other AWS services (e.g. Lambda, Redshift, EMR, etc.)?

iv) Can we use AWS Lambda to persist data into a database? (AWS Kinesis supports that functionality; what about AWS MSK?)

v) Our message rate is 50 msg/second. What is the most cost-effective solution?

asked Mar 05 '20 10:03 by HASH




1 Answer

To be blunt, your use case sounds simple and 50 messages a second is a very low rate.

Kinesis is a firehose where you need a straw. Kinesis is meant to ingest, transform and process terabytes of moving data.
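To put a number on "firehose vs straw", here is a rough back-of-the-envelope sketch. It uses the documented per-shard write limits for Kinesis Data Streams in provisioned mode (1,000 records/s and 1 MB/s per shard). The 2,000-byte average message size is purely an assumption for illustration, since the question does not state one, and the function name is made up.

```python
import math

# Per-shard write limits for Kinesis Data Streams (provisioned mode).
SHARD_MAX_RECORDS_PER_SEC = 1000
SHARD_MAX_BYTES_PER_SEC = 1_000_000  # 1 MB/s


def shards_needed(msgs_per_sec, avg_msg_bytes):
    """Return the minimum shard count that satisfies both write limits."""
    by_records = math.ceil(msgs_per_sec / SHARD_MAX_RECORDS_PER_SEC)
    by_bytes = math.ceil(msgs_per_sec * avg_msg_bytes / SHARD_MAX_BYTES_PER_SEC)
    return max(1, by_records, by_bytes)


if __name__ == "__main__":
    rate = 50          # messages per second, from the question
    size = 2000        # ASSUMED average JSON message size in bytes
    print("shards needed:", shards_needed(rate, size))
    print("record-rate utilisation of one shard:", rate / SHARD_MAX_RECORDS_PER_SEC)
    print("byte-rate utilisation of one shard:", rate * size / SHARD_MAX_BYTES_PER_SEC)
```

Under that assumption a single shard is loaded to about 5% of its record limit and 10% of its byte limit, which is the sense in which 50 msg/s is a very low rate.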

Have you considered looking at SQS or Amazon MQ instead? Both are considerably simpler to use and manage than Kafka or Kinesis. Just from your questions it's clear you have not interacted with Kafka at all, so you're going to have a steep learning curve. SQS is a simple API-based queueing system: you publish to an SQS queue, and you consume from the queue. If you don't need to worry about ordering, routing, etc., it is a persistent and reliable (if clunky) technology that lots of people use to great success.
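As a minimal sketch of that publish/consume model: the functions below take any client object exposing `send_message`, `receive_message` and `delete_message` with the keyword arguments used by the boto3 SQS client. `InMemoryQueueClient` is a hypothetical stand-in I wrote so the example runs without AWS credentials; it is not part of any SDK, and the queue URL is a placeholder.

```python
import json


def publish(client, queue_url, payload):
    """Serialize the payload to JSON and enqueue it."""
    client.send_message(QueueUrl=queue_url, MessageBody=json.dumps(payload))


def drain_once(client, queue_url, handler):
    """Receive one batch, hand each decoded body to handler, then delete it.

    A message is deleted only after handler returns without raising,
    so a failed message stays on the queue. Returns the number handled.
    """
    resp = client.receive_message(
        QueueUrl=queue_url, MaxNumberOfMessages=10, WaitTimeSeconds=20
    )
    handled = 0
    for msg in resp.get("Messages", []):
        handler(json.loads(msg["Body"]))
        client.delete_message(QueueUrl=queue_url, ReceiptHandle=msg["ReceiptHandle"])
        handled += 1
    return handled


class InMemoryQueueClient:
    """Hypothetical stand-in mimicking the three client calls used above."""

    def __init__(self):
        self._messages = {}  # receipt handle -> body, in insertion order
        self._next_id = 0

    def send_message(self, QueueUrl, MessageBody):
        self._next_id += 1
        self._messages[str(self._next_id)] = MessageBody

    def receive_message(self, QueueUrl, MaxNumberOfMessages=1, WaitTimeSeconds=0):
        batch = list(self._messages.items())[:MaxNumberOfMessages]
        return {"Messages": [{"Body": b, "ReceiptHandle": h} for h, b in batch]}

    def delete_message(self, QueueUrl, ReceiptHandle):
        self._messages.pop(ReceiptHandle, None)


if __name__ == "__main__":
    client = InMemoryQueueClient()
    publish(client, "placeholder-queue-url", {"sensor": "a", "value": 1})
    drain_once(client, "placeholder-queue-url", print)
```

Against the real service you would pass `boto3.client("sqs")` instead of the stand-in. The on-prem C++ producer would make the equivalent SendMessage call through the AWS SDK for C++.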

To answer your actual questions:

  1. Amazon publishes a C++ SDK for their services, and I would be stunned if there wasn't a Kinesis client as part of it. You would need either a public Kinesis endpoint, or a private Kinesis endpoint accessible via some sort of tunnel or gateway between your on-prem network and your AWS VPC.

  2. MSK is Kafka. You need an Apache Kafka C++ client and, as with Kinesis above, some sort of tunnel or gateway from your on-prem network to the AWS VPC where you have provisioned MSK.

  3. It's possible, but it's unlikely there are any turn-key solutions for this. You will have to write some sort of bridging software from Kafka to the other systems.

  4. You can possibly use Lambda, so long as you cater for failures, timeouts, and other failure modes. To be honest, a stand-alone consumer running as a service in your VPC or on-prem is a better idea.

  5. SQS or Amazon MQ, as previously mentioned, are likely to be simpler and more cost-effective than MSK, and will almost certainly be cheaper than Kinesis.
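On point 4, "catering for failures" in a consumer usually comes down to retrying the database write with a backoff and setting aside records that still fail. The sketch below is one way to do that, not a definitive implementation: `persist` is a hypothetical callable you would supply (for example, a function that runs an INSERT against PostgreSQL), and the attempt count and delays are arbitrary defaults.

```python
import time


def persist_with_retry(persist, record, max_attempts=3, base_delay=0.5,
                       sleep=time.sleep):
    """Call persist(record), retrying with exponential backoff on any exception.

    Returns True on success and False once max_attempts is exhausted.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            persist(record)
            return True
        except Exception:
            if attempt == max_attempts:
                return False
            # Delays grow as base_delay, 2*base_delay, 4*base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
    return False


def process_batch(records, persist, **retry_opts):
    """Persist each record; return the ones that failed every attempt.

    The caller decides what to do with the failures, e.g. send them to a
    dead-letter queue or log them for manual replay.
    """
    failed = []
    for record in records:
        if not persist_with_retry(persist, record, **retry_opts):
            failed.append(record)
    return failed
```

The same two functions work whether the records come from a Lambda event or from a long-running consumer loop. The difference is that a stand-alone service is not bound by the Lambda execution timeout while it backs off.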

answered Oct 18 '22 22:10 by mcfinnigan