In Azure Event Hubs, 1 throughput unit equals 1 MB/sec of ingress, so it can take 1000 messages of 1 KB each per second. If I select 5 or more throughput units, would I be able to ingest 5000 messages/second of 1 KB each with 4 partitions? What would the egress be in that case? I am not sure about the limit on an Event Hub partition; I read that it is also 1 MB/sec. Does that mean that to use Event Hubs effectively I need the same number of partitions as throughput units?
The throughput capacity of Event Hubs is controlled by throughput units. Throughput units are pre-purchased units of capacity. A single throughput unit lets you: Ingress: Up to 1 MB per second or 1000 events per second (whichever comes first). Egress: Up to 2 MB per second or 4096 events per second.
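As a rough sketch of the "whichever comes first" rule, the arithmetic for sizing ingress could look like the following. The function name is made up for illustration, and it assumes 1 MB = 1,000,000 bytes; it only covers the throughput unit limits quoted above and ignores any per-partition limit.

```python
import math

# Per-throughput-unit ingress limits quoted above.
INGRESS_MB_PER_TU = 1.0
INGRESS_EVENTS_PER_TU = 1000


def required_ingress_tus(events_per_sec: int, event_size_bytes: int) -> int:
    """Hypothetical helper: TUs needed so neither ingress limit is exceeded."""
    # Assumption: 1 MB = 1,000,000 bytes.
    mb_per_sec = events_per_sec * event_size_bytes / 1_000_000
    by_bandwidth = math.ceil(mb_per_sec / INGRESS_MB_PER_TU)
    by_event_count = math.ceil(events_per_sec / INGRESS_EVENTS_PER_TU)
    # Whichever limit is hit first decides the TU count.
    return max(1, by_bandwidth, by_event_count)
```

For the workload in the question, `required_ingress_tus(5000, 1000)` comes out at 5, since both the bandwidth limit and the event-count limit are reached at the same point.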
The number of partitions is specified when the event hub is created; it is static and cannot be modified later, so choose it with care. Using the standard Azure portal you can create between 2 and 32 partitions (default 4); if required, you can create up to 1024 partitions by contacting Azure support.
The partition key is processed through a static hashing function, which creates the partition assignment. If you don't specify a partition key when publishing an event, a round-robin assignment is used.
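To illustrate the two assignment modes, here is a minimal sketch. The class name is hypothetical, and SHA-256 is used only as a stand-in for a stable hash; it is not the hashing function Event Hubs actually uses.

```python
import hashlib
from itertools import count
from typing import Optional


class PartitionAssigner:
    """Illustrative model of key-hash vs. round-robin partition assignment."""

    def __init__(self, partition_count: int) -> None:
        self.partition_count = partition_count
        self._next = count()  # round-robin counter for events without a key

    def assign(self, partition_key: Optional[str] = None) -> int:
        if partition_key is None:
            # No key: cycle through the partitions in order.
            return next(self._next) % self.partition_count
        # Key given: a stable hash maps the same key to the same partition.
        # Assumption: SHA-256 is a stand-in, not the service's real hash.
        digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % self.partition_count
```

With 4 partitions, keyless events land on partitions 0, 1, 2, 3, 0, and so on, while every event published with the same key (for example a device ID) ends up on the same partition.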
Great question.
A few basics.
1 Throughput Unit (TU) means an ingress limit of 1 MB/sec or 1000 msgs/sec - whichever happens first. You pay for TUs and you can change TUs as per your load requirements. This is your knob to control the bill. And TUs are set on a given Event Hubs Namespace!
When you buy 1 TU for an EventHubs Namespace and create a number of EventHubs in it, the limit of 1 MB/sec or 1000 msgs/sec applies cumulatively across them. The limit also applies to each partition individually, although sometimes you might get lucky in regions where load is low.
Consider these principles while deciding on no. of partitions in eventhub for your service:
Another thing to note is, a TU is configured at namespace level. And, one Event Hubs namespace can have multiple EventHubs in it and each EventHub can have a different no. of partitions.
Answers:
If you select 5 or more TUs on the Namespace and have only 1 EventHub with 4 partitions you will get a max. of 4 MB/sec or 4K msgs/sec.
Egress max will be 2X of ingress (8 MB/sec or 8K msgs/sec). In other words, you could create 2 patterns of receives (e.g. slow and fast) by creating 2 consumer groups. If you need more than 2X parallel receives then you will need to buy more TUs.
Yes, ideally you will need more partitions than TUs. First model your partition count as mentioned above. Start with 1 TU while you are developing your solution. Once done, when you are doing load testing or going live, increase TUs in tune with your load. Remember, you could have multiple EventHubs in a Namespace. So, having 20 TUs at Namespace level and 10 EventHubs with 4 partitions each can deliver 20 MB/sec across the Namespace.
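The numbers in these answers can be reproduced with a small sketch of the model described here: ingress is capped both by the TUs on the Namespace and by the total number of partitions (1 MB/sec each), and egress is twice the ingress. The function names are hypothetical, and this is only the simplified model from this answer, not an official formula.

```python
from typing import Sequence


def namespace_ingress_mb_per_sec(throughput_units: int,
                                 partitions_per_hub: Sequence[int]) -> int:
    """Max ingress in MB/sec for a Namespace under this answer's model."""
    # Each partition can take at most 1 MB/sec, so partitions cap usable TUs.
    total_partitions = sum(partitions_per_hub)
    return min(throughput_units, total_partitions)


def namespace_egress_mb_per_sec(throughput_units: int,
                                partitions_per_hub: Sequence[int]) -> int:
    """Max egress in MB/sec, taken as 2X the ingress figure."""
    return 2 * namespace_ingress_mb_per_sec(throughput_units, partitions_per_hub)
```

For 5 TUs and one EventHub with 4 partitions, `namespace_ingress_mb_per_sec(5, [4])` gives 4 and the egress function gives 8. For 20 TUs and 10 EventHubs with 4 partitions each, `namespace_ingress_mb_per_sec(20, [4] * 10)` gives 20.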
More on EventHubs
One partition goes to one TPU. Think of TPUs as a processing engine. You can't take advantage of more TPUs than you have partitions. If you have 4 partitions, you can't use more than 4 TPUs.
It's typical to have more partitions than TPUs, for the following reasons
As for throughput, the limits are 1 MB ingress/2 MB egress per TPU. This covers the typical scenario where each event is sent both to cold storage (e.g. a database) and to Stream Analytics or an event processor for analysis, monitoring, etc.