
Why not always configure for max number of event hub partitions?

The Azure Event Hubs overview article states the following:

The number of partitions is specified at the Event Hub creation time and must be between 8 and 32. Partitions are a data organization mechanism and are more related to the degree of downstream parallelism required in consuming applications than to Event Hubs throughput. This makes the choice of the number of partitions in an Event Hub directly related to the number of concurrent readers you expect to have. After Event Hub creation, the partition count is not changeable; you should consider this number in terms of long-term expected scale. You can increase the 32 partition limit by contacting the Azure Service Bus team.

Since you cannot change the number of partitions on your event hub after initial creation, why not just always configure it to the maximum number of partitions, 32? I do not see any pricing implications in doing this. Is there some performance trade-off?

Also, as a side note, I appear to be able to create an event hub with fewer than 8 partitions, even though the article says it must be between 8 and 32. Not sure why it says that...

asked Aug 12 '15 at 17:08 by kspearrin

People also ask

How many partitions should I have in an event hub?

It must be between 1 and the maximum partition count allowed for each pricing tier. For the partition count limit for each tier, see this article. We recommend that you choose at least as many partitions as you expect to be required during the peak load of your application for that particular event hub.

What is the maximum size of an event in an Azure event hub using the basic plan?

The maximum message size allowed for Event Hubs is 1 MB.

How do I increase partition count in event hub?

Use the AlterTopics API (for example, via the kafka-topics CLI tool) to increase the partition count. For details, see Modifying Kafka topics.
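
For illustration only, here is a minimal sketch of the same operation using the kafka-python AdminClient instead of the kafka-topics CLI mentioned above; the broker address, topic name, and target partition count are placeholders:

    # Sketch only: raise a Kafka topic's partition count with kafka-python's
    # AdminClient rather than the kafka-topics CLI mentioned above.
    from kafka.admin import KafkaAdminClient, NewPartitions

    admin = KafkaAdminClient(bootstrap_servers="broker:9092")  # placeholder broker

    # Partition counts can only grow, never shrink; "my-topic" and 8 are placeholders.
    admin.create_partitions({"my-topic": NewPartitions(total_count=8)})
    admin.close()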


1 Answer

It's my understanding that each partition requires its own consumer. You could do this via multi-threading in a single process, via multiple processes, or even via multiple machines each running a process, but this comes with a degree of complexity: either managing all of those processes to ensure that every partition is being consumed, or synchronizing items/events that span partitions.

So the implications are less about pricing than they are about scalability/complexity. :)
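
To make the "one reader per partition" point concrete, here is a minimal sketch assuming the azure-eventhub (v5) Python SDK; the connection string and hub name are placeholders, and it simply starts one reader thread per partition:

    # Sketch only, assuming the azure-eventhub (v5) Python SDK and placeholder
    # connection details; it shows one reader (thread) per partition.
    import threading
    from azure.eventhub import EventHubConsumerClient

    CONN_STR = "<event hubs namespace connection string>"  # placeholder
    HUB_NAME = "<event hub name>"                          # placeholder

    def on_event(partition_context, event):
        # Each callback only ever sees events from the partition it was bound to.
        print(f"partition {partition_context.partition_id}: {event.body_as_str()}")

    def read_partition(partition_id):
        client = EventHubConsumerClient.from_connection_string(
            CONN_STR, consumer_group="$Default", eventhub_name=HUB_NAME
        )
        with client:
            # Blocks, pumping events from this single partition only.
            client.receive(on_event=on_event, partition_id=partition_id,
                           starting_position="-1")

    # Discover the partition ids, then start one reader thread per partition.
    # With 32 partitions this means managing 32 of these readers.
    with EventHubConsumerClient.from_connection_string(
            CONN_STR, consumer_group="$Default", eventhub_name=HUB_NAME) as lookup:
        partition_ids = lookup.get_partition_ids()

    threads = [threading.Thread(target=read_partition, args=(pid,), daemon=True)
               for pid in partition_ids]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

The SDKs' load-balancing/checkpointing features (or the older EventProcessorHost) hide much of this bookkeeping in practice, but a reader per partition still exists underneath, which is exactly the complexity being traded against a higher partition count.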

answered Oct 05 '22 at 23:10 by BrentDaCodeMonkey