I've been learning about Event Hubs and would like confirmation or correction of my understanding. I'm used to relying on retries, poison-message handling, at-least-once delivery, and so on in conventional enterprise messaging, which Azure Service Bus Queues and Topics give me. Event Hubs seems to be a different tool for very high scale, where you give up some of the more "enterprise" features in exchange for much higher throughput.
Am I thinking about this correctly? Are there additional specifics I need to consider? I realize there may be some functional overlap between Event Hubs and Topics; I'm just looking for clarity on when to use Event Hubs.
Service Bus is used as a backbone that connects applications running in the cloud to other applications or services and transfers data between them, whereas Event Hubs is more concerned with receiving massive volumes of data with high throughput and low latency.
Azure Event Hubs is a big data streaming platform and event ingestion service. It can receive and process millions of events per second. Data sent to an event hub can be transformed and stored by using any real-time analytics provider or batching/storage adapters.
A notable difference from Event Grid is that Event Hubs exposes endpoints only for ingesting data and provides no mechanism for sending data back to publishers. Event Grid, on the other hand, sends HTTP requests to notify subscribers of events that happen in publishers.
An Event Hubs namespace is a management container for event hubs (or topics, in Kafka parlance). It provides DNS-integrated network endpoints and a range of access control and network integration management features such as IP filtering, virtual network service endpoint, and Private Link.
If you have the choice, it's almost always easier to write a system around a full enterprise pub/sub messaging system, where you can mark individual events as consumed, retry messages, and use just about every other wonderful feature. If you've already accepted partitioning your message channel (which Azure Service Bus Topics appear to support), then you could in principle scale a more full-featured messaging system to the degree you require. The question is: at what cost?
An Azure Service Bus Topic costs approximately $0.20 per million messages at high scale; Amazon SQS (somewhat similar) lists $0.50 per million. If you host it yourself, you'll likely need to set up a lot of RabbitMQ servers, or even multiple clusters as you partition.
Azure Event Hubs costs $0.028 per million events plus an amount per throughput unit; the same goes for Amazon Kinesis. Apache Kafka has been benchmarked at 2 million messages per second on 3 machines.
At, say, 20,000 events per second sustained, the difference between Azure Service Bus Topics and Azure Event Hubs is in the range of a full-time developer's salary. At 2 million per second sustained (which requires contacting Microsoft), the difference approaches $1M/month.
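To make that arithmetic concrete, here is a rough sketch using only the per-million prices quoted in this answer. It assumes a 30-day month and a perfectly sustained rate, and it ignores the per-throughput-unit charge on Event Hubs and any tiering or discounts, so treat the output as an order-of-magnitude check, not a quote. The function and constant names are made up for the example.

```python
# Back-of-the-envelope comparison of per-message charges only.
# Assumptions: 30-day month, constant sustained rate, and the prices
# quoted in the answer above (not current list prices).
SECONDS_PER_MONTH = 60 * 60 * 24 * 30

TOPIC_PRICE_PER_MILLION = 0.20       # Azure Service Bus Topic, as quoted
EVENT_HUB_PRICE_PER_MILLION = 0.028  # Azure Event Hubs, as quoted


def monthly_message_cost(events_per_second, price_per_million):
    """Return the per-message cost in dollars for one 30-day month."""
    events_per_month = events_per_second * SECONDS_PER_MONTH
    return events_per_month / 1_000_000 * price_per_million


for rate in (20_000, 2_000_000):
    topic = monthly_message_cost(rate, TOPIC_PRICE_PER_MILLION)
    hub = monthly_message_cost(rate, EVENT_HUB_PRICE_PER_MILLION)
    print(
        f"{rate:>9,} events/s: topics ${topic:,.0f}/mo, "
        f"event hub ${hub:,.0f}/mo, difference ${topic - hub:,.0f}/mo"
    )
```

Under those assumptions the gap comes out at roughly $8,900 per month (a bit over $100K per year) at 20,000 events per second, and roughly $890,000 per month at 2 million per second, which matches the figures above.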
Basically, use a partitioned stream/log system with offset tracking when you either don't need all the useful features of a full messaging system, or don't need them enough to pay the ~10X premium (or when you can't use a full messaging system because you can't scale it far enough without heroic efforts).
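If the difference between the two models is unclear, the toy sketch below may help. It is plain in-memory Python, not the Event Hubs or Service Bus SDK, and every class and method name in it (`PartitionLog`, `OffsetConsumer`, `AckQueue`, `drain`) is invented for illustration. The log-style consumer only remembers an offset, so the broker does almost no per-message bookkeeping and events can be replayed; the queue settles each message individually, which is what makes retries and dead-lettering of poison messages possible.

```python
from collections import deque


class PartitionLog:
    """Append-only log for one partition; events are never removed on read."""

    def __init__(self):
        self.events = []

    def append(self, event):
        self.events.append(event)

    def read_from(self, offset, max_count):
        return self.events[offset:offset + max_count]


class OffsetConsumer:
    """Consumer that tracks only 'next offset to read' (a checkpoint)."""

    def __init__(self, log):
        self.log = log
        self.checkpoint = 0

    def poll(self, max_count=10):
        return self.log.read_from(self.checkpoint, max_count)

    def commit(self, count):
        # Advancing the checkpoint acknowledges a whole run of events at once.
        self.checkpoint += count


class AckQueue:
    """Queue with per-message settlement, retries, and a dead-letter list."""

    def __init__(self, max_deliveries=3):
        self.pending = deque()
        self.dead_letter = []
        self.max_deliveries = max_deliveries

    def send(self, body):
        self.pending.append({"body": body, "deliveries": 0})

    def receive(self):
        if not self.pending:
            return None
        message = self.pending.popleft()
        message["deliveries"] += 1
        return message

    def abandon(self, message):
        # Retry until the delivery limit, then park it as a poison message.
        if message["deliveries"] >= self.max_deliveries:
            self.dead_letter.append(message)
        else:
            self.pending.append(message)


def drain(queue, handler):
    """Process every message; a message is completed simply by not abandoning it."""
    handled = []
    while (message := queue.receive()) is not None:
        try:
            handled.append(handler(message["body"]))
        except ValueError:
            queue.abandon(message)
    return handled
```

With the log, a consumer that crashes before calling `commit` simply re-reads from its last checkpoint, so deduplication and poison-event handling become the application's job. With the queue, the broker tracks each message's delivery count for you, and that per-message state is part of what you pay for.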