
Simplest way to log all messages from an Azure Event Hub

I'm using a service which outputs to an Event Hub.

We want to store that output, to be read once per day by a batch job running on Apache Spark. Basically we figured, just get all messages dumped to blobs.

What's the easiest way to capture messages from an Event Hub to Blob Storage?

Our first thought was a Stream Analytics job, but it requires the raw message to be parseable as CSV, JSON, or Avro, and our current format is none of those.


Update: We solved this problem by changing our message format. I'd still like to know if there's any low-impact way to store messages to blobs. Did Event Hubs have a solution for this before Stream Analytics arrived?

Iain asked Aug 18 '15 03:08


1 Answer

You could write your own worker process to read the messages off the Event Hub and store them in blob storage. This does not need to happen in real time, because messages remain on the Event Hub for the configured retention period. The client that reads the Event Hub is responsible for tracking which messages have been processed, by recording the partition ID and offset of the last message it handled. There is a C# library that makes this extremely easy and scales really well: https://azure.microsoft.com/en-us/documentation/articles/event-hubs-csharp-ephcs-getstarted/
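To make the partition ID and offset bookkeeping concrete, here is a minimal Python sketch of the idea. It is not the C# library linked above and it does not call any Azure SDK: `OffsetCheckpoint` and `drain_partition` are hypothetical names, the events are an in-memory list standing in for whatever the real client returns, a local directory stands in for blob storage, and integer offsets are a simplification.

```python
import json
import tempfile
from pathlib import Path


class OffsetCheckpoint:
    """Hypothetical helper: last processed offset per partition, kept in a JSON file."""

    def __init__(self, path):
        self.path = Path(path)
        self.offsets = json.loads(self.path.read_text()) if self.path.exists() else {}

    def last_offset(self, partition_id):
        # -1 means nothing from this partition has been processed yet.
        return self.offsets.get(partition_id, -1)

    def save(self, partition_id, offset):
        self.offsets[partition_id] = offset
        self.path.write_text(json.dumps(self.offsets))


def drain_partition(partition_id, events, checkpoint, out_dir):
    """Append not-yet-processed payloads to one file per partition.

    `events` is a list of (offset, bytes) pairs in offset order (an assumption
    about the shape of the data, not a real client API).
    Returns the number of messages written.
    """
    start = checkpoint.last_offset(partition_id)
    pending = [(offset, body) for offset, body in events if offset > start]
    if not pending:
        return 0
    out_file = Path(out_dir) / f"partition-{partition_id}.bin"
    with out_file.open("ab") as sink:
        for _, body in pending:
            # Length-prefix each payload so raw, non-JSON bytes can be split again later.
            sink.write(len(body).to_bytes(4, "big") + body)
    # Checkpoint only after the write succeeded: a crash in between means
    # messages may be written twice, but never skipped.
    checkpoint.save(partition_id, pending[-1][0])
    return len(pending)


if __name__ == "__main__":
    work_dir = Path(tempfile.mkdtemp())
    checkpoint = OffsetCheckpoint(work_dir / "checkpoint.json")
    sample = [(0, b"first"), (1, b"second")]
    print(drain_partition("0", sample, checkpoint, work_dir))
```

Because the checkpoint is written after the payloads, a second run over the same events writes nothing new, which is what lets the worker run as an occasional batch rather than in real time.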

Tim Benroeck answered Sep 22 '22 21:09