
Apache Flink vs Twitter Heron?

There are a lot of questions comparing Flink vs Spark Streaming, Flink vs Storm and Storm vs Heron.

This question stems from the fact that both Apache Flink and Twitter Heron are true stream processing frameworks (not micro-batch, like Spark Streaming). Twitter decommissioned Storm last year and now uses Heron instead (which is basically Storm reworked).

There are nice presentations by Slim Baltagi on Flink and Flink vs Spark: https://www.youtube.com/watch?v=G77m6Ou_kFA

Nice research by Ilya Ganelin on various streaming frameworks: https://www.youtube.com/watch?v=KkjhyBLupvs

Pretty interesting thoughts on Flink vs Storm: What is/are the main difference(s) between Flink and Storm?

But I haven't seen any comparison of the new Storm/Heron vs Apache Flink.

Both projects are pretty young, and both support running previously written Storm applications, among other things. Flink fits better into the Hadoop ecosystem, while Heron is more tied to Twitter's own stack.

Any thoughts?

asked Jun 04 '16 22:06 by experimenter


1 Answer

All of the points in the referenced article comparing Apache Flink and Apache Storm also apply to Twitter's Heron. Heron provides exactly the same type of semantics and functionality as Storm. Heron is really best understood simply as a re-implementation of Storm that better fits Twitter's operational requirements.

answered Sep 27 '22 23:09 by Jamie Grier