I am using AWS Kinesis Firehose to ingest data into S3 and then consume it with Athena.
I am trying to analyze events from different games. To avoid having Athena scan too much data, I would like to partition the S3 data using an identifier for each game, but so far I have not found a solution, since Firehose receives data from different games.
Does anyone know how to do this?
Thank you, Javi.
Dynamic partitioning enables you to continuously partition streaming data in Kinesis Data Firehose by using keys within data (for example, customer_id or transaction_id) and then deliver the data grouped by these keys into corresponding Amazon Simple Storage Service (Amazon S3) prefixes.
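As a rough sketch of how this could look for your case, the boto3 call below creates a delivery stream whose dynamic partitioning key is extracted from a `game_id` field in each JSON record. The stream name, bucket ARN, role ARN, and the `game_id` field name are placeholders you would replace with your own values:

```python
import boto3

firehose = boto3.client("firehose")

# Sketch only: names and ARNs below are placeholders, not real resources.
firehose.create_delivery_stream(
    DeliveryStreamName="game-events",  # hypothetical stream name
    DeliveryStreamType="DirectPut",
    ExtendedS3DestinationConfiguration={
        "RoleARN": "arn:aws:iam::123456789012:role/firehose-delivery-role",  # placeholder
        "BucketARN": "arn:aws:s3:::my-game-events-bucket",                   # placeholder
        # Firehose substitutes the extracted key into the prefix, producing
        # Hive-style partitions (game_id=...) that Athena can prune.
        "Prefix": "events/game_id=!{partitionKeyFromQuery:game_id}/",
        "ErrorOutputPrefix": "errors/!{firehose:error-output-type}/",
        # Dynamic partitioning requires a buffer size of at least 64 MB.
        "BufferingHints": {"SizeInMBs": 64, "IntervalInSeconds": 300},
        "DynamicPartitioningConfiguration": {"Enabled": True},
        "ProcessingConfiguration": {
            "Enabled": True,
            "Processors": [
                {
                    "Type": "MetadataExtraction",
                    "Parameters": [
                        # JQ expression that pulls game_id out of each JSON record.
                        {"ParameterName": "MetadataExtractionQuery",
                         "ParameterValue": "{game_id: .game_id}"},
                        {"ParameterName": "JsonParsingEngine",
                         "ParameterValue": "JQ-1.6"},
                    ],
                }
            ],
        },
    },
)
```

With the objects laid out under `game_id=...` prefixes, you can declare `game_id` as a partition column in your Athena table so that queries filtered on a single game only scan that game's data.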
Note that Kinesis Data Analytics cannot write data to Amazon S3 with server-side encryption enabled on Kinesis Data Analytics. You can create the Kinesis stream and Amazon S3 bucket using the console.
Whether you have one stream or many streams, this solution can handle scaling for you. The benefits include less operational overhead, increased throughput, and reduced costs. Everything you need to get started is available on the Kinesis Auto Scaling GitHub repo.
In addition, Kinesis Data Streams synchronously replicates data across three Availability Zones, providing high availability and data durability.
You could possibly use Amazon Kinesis Analytics to split incoming Firehose streams into separate output streams based upon some logic, such as Game ID.
It can accept a KinesisFirehoseInput and send data to a KinesisFirehoseOutput.
However, the limits documentation seems to suggest that there can only be 3 output destinations per application, so this might not be sufficient if you have more than a few games.