Amazon claims their Kinesis streaming product guarantees record ordering.
It provides ordering of records, as well as the ability to read and/or replay records in the same order (...)
Kinesis is composed of Streams that are themselves composed of one or more Shards. Records are stored in these Shards. We can write consumer applications that connect to a Shard and read/replay records in the order they were stored.
But can Kinesis guarantee, out of the box, ordering for the Stream itself without pushing ordering logic to the consumers? How can a consumer read records from multiple Shards of the same Stream, making sure the records are read in the same order they were added to the Stream?
It seems this is not possible to achieve. Ordering is guaranteed at the shard level, but not across the whole stream.
https://brandur.org/kinesis-order
So back to our original question: how can we guarantee that all records are consumed in the same order in which they’re produced? The answer is that we can’t, but that we shouldn’t let that unfortunate reality bother us too much. Once we’ve scaled our stream to multiple shards, there’s no mechanism that we can use to guarantee that records are consumed in order across the whole stream; only within a single shard.
If you need guaranteed order of all data in the stream you can only have one shard. That, of course, doesn't scale very well. What you need to determine is whether you really need that level of ordered data. Is all the data in the stream related to all the other data? The key is to put data in shards when the data is related. Use multiple shards to allow your data to be processed in parallel. If all related data is together in one shard you can take advantage of the guaranteed ordering. If you really need all the data to be ordered you're just going to have to deal with the limited scaling that necessarily comes with that.
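To make "put related data in the same shard" concrete, here is a minimal local simulation in Python, not a real Kinesis client. Kinesis maps each record's partition key to a shard by taking the MD5 hash of the key as a 128-bit integer and finding the shard whose hash key range contains it. The sketch assumes the shards split the hash space into equal ranges, and the names `shard_for_key`, `put_records` and `events_for` are hypothetical helpers made up for illustration.

```python
import hashlib
from collections import defaultdict

HASH_SPACE = 2 ** 128  # MD5 produces a 128-bit integer


def shard_for_key(partition_key: str, num_shards: int) -> int:
    """Map a partition key to a shard index.

    Assumption: the shards divide the hash space into equal ranges,
    which is not necessarily true of a stream that has been resharded.
    """
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_value = int.from_bytes(digest, "big")
    return hash_value * num_shards // HASH_SPACE


def put_records(records, num_shards):
    """Append each (partition_key, payload) pair to its shard, in arrival order."""
    shards = defaultdict(list)
    for partition_key, payload in records:
        shard_id = shard_for_key(partition_key, num_shards)
        shards[shard_id].append((partition_key, payload))
    return shards


def events_for(shards, partition_key):
    """Read back the payloads for one key, shard by shard."""
    return [
        payload
        for shard_id in sorted(shards)
        for key, payload in shards[shard_id]
        if key == partition_key
    ]


# Events from two users, interleaved as they were produced.
records = [
    ("user-1", "login"),
    ("user-2", "login"),
    ("user-1", "add-to-cart"),
    ("user-2", "logout"),
    ("user-1", "checkout"),
]

shards = put_records(records, num_shards=4)
user_1_events = events_for(shards, "user-1")
user_2_events = events_for(shards, "user-2")
```

Because every record with the same partition key hashes to the same shard, each user's events come back in the order they were written, however many shards the stream has. What is lost is the relative order between `user-1` and `user-2` events if they happen to land in different shards, which is exactly the cross-shard ordering the answers above say you cannot get.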