There are a lot of questions comparing Flink vs Spark Streaming, Flink vs Storm and Storm vs Heron.
This question comes from the fact that both Apache Flink and Twitter Heron are true stream processing frameworks (not micro-batch, like Spark Streaming). Twitter decommissioned Storm last year and now uses Heron instead (which is basically Storm reworked).
There are nice presentations by Slim Baltagi on Flink and Flink vs Spark: https://www.youtube.com/watch?v=G77m6Ou_kFA
Nice research by Ilya Ganelin on various streaming frameworks: https://www.youtube.com/watch?v=KkjhyBLupvs
Pretty interesting thoughts on Flink vs Storm: What is/are the main difference(s) between Flink and Storm?
But I haven't seen any comparison of new Storm/Heron vs Apache Flink.
Both projects are pretty young, and both support running previously written Storm applications, among many other things. Flink fits better into the Hadoop ecosystem, while Heron is geared more toward Twitter's own stack.
Any thoughts?
Netflix engineers recently published how they built Studio Search, using Apache Kafka streams, an Apache Flink-based Data Mesh process, and an Elasticsearch sink to manage the index.
Apache Flink is an excellent choice for developing and running many different types of applications because of its extensive feature set. Flink's features include support for stream and batch processing, sophisticated state management, event-time processing semantics, and exactly-once consistency guarantees for state.
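To make "event-time processing semantics" a bit more concrete, here is a minimal sketch in plain Python. It does not use Flink's API. The function name `tumbling_event_time_counts` and its parameters are hypothetical, and the watermark rule (largest event time seen minus a fixed allowed out-of-orderness) is a simplifying assumption. It only shows the idea: events are assigned to windows by their own timestamps rather than by arrival order, and a window is emitted once the watermark passes its end.

```python
from collections import defaultdict


def tumbling_event_time_counts(events, window_size, max_out_of_orderness):
    """Count events per key in tumbling windows based on event time.

    events: iterable of (event_time, key) in ARRIVAL order, possibly out of order.
    Returns a list of (window_start, key, count) in the order windows are emitted.
    This is an illustrative model, not Flink's actual implementation.
    """
    open_windows = defaultdict(int)  # (window_start, key) -> running count
    emitted = []
    watermark = float("-inf")

    for event_time, key in events:
        window_start = event_time - (event_time % window_size)

        # An event whose window is already closed is "late"; this sketch drops it.
        if window_start + window_size <= watermark:
            continue

        open_windows[(window_start, key)] += 1

        # Assumed watermark rule: largest timestamp seen minus allowed disorder.
        watermark = max(watermark, event_time - max_out_of_orderness)

        # Emit every window whose end is at or before the watermark.
        closed = sorted(w for w in open_windows if w[0] + window_size <= watermark)
        for window in closed:
            emitted.append((window[0], window[1], open_windows.pop(window)))

    # End of input: flush whatever is still open.
    for window in sorted(open_windows):
        emitted.append((window[0], window[1], open_windows[window]))
    return emitted


if __name__ == "__main__":
    arrivals = [(1, "a"), (4, "a"), (3, "a"), (12, "a"), (2, "a"), (25, "a")]
    for row in tumbling_event_time_counts(arrivals, window_size=10, max_out_of_orderness=2):
        print(row)
```

In this run the event with timestamp 3 arrives after the one with timestamp 4 but is still counted in the window starting at 0, while the event with timestamp 2 arrives after that window has closed and is dropped. A real engine offers more options for late data, so treat this only as a mental model.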
Flink consistently delivers lower latency than Spark, even at higher throughput. Spark can achieve low latency at lower throughput, but its latency rises as throughput increases.
Flink enables real-time analytics and insights on streaming data. It is a good fit for continuous extract-transform-load (ETL) pipelines on streaming data and for event-driven applications.
All of the points in the referenced article comparing Apache Flink and Apache Storm also apply to Twitter's Heron. Heron provides exactly the same type of semantics and functionality as Storm. Heron is really best understood simply as a re-implementation of Storm that better fits Twitter's operational requirements.