Both are streaming frameworks that process one event at a time. What are the core architectural differences between these two technologies? Also, what are some particular use cases where one is more appropriate than the other?

Apache Apex vs Apache Flink

1 Answer

As you mentioned, both are streaming platforms that do in-memory computation in real time. But there are some architectural differences when you take a closer look.

YARN: Apex has a YARN-native architecture. It fully utilises YARN for scheduling, security, and multi-tenancy, whereas Flink integrates with YARN. Apex can do resource allocation at the operator (container) level with YARN.
Partitioning: Apex supports several sophisticated stream partitioning schemes and also allows controlling operator locality and stream locality. Flink supports simple hash partitions and custom partitions.
Dynamic changes: Apex allows changes to the topology without taking down the application. The application can be updated at runtime, so you can add and remove operators, update properties of operators, or automatically scale the application while it runs. Apache Flink does not support any of these capabilities.
Buffer Server: There is a message bus called the buffer server between operators. Subscribers can connect to the buffer server and fetch data from particular offsets. It is window-aware and holds data until no subscriber needs it any longer.
Fault tolerance: Apex has an incremental recovery model. On failure, only the affected part of the topology is restarted, with no need to go back to the source, whereas Flink goes back to the source.
API: Apex has a high-level API as well as a low-level API. Flink only has a high-level API.
Library: Apex has a library called Apache Malhar, which has a wide variety of well-tested connectors and processing operators that can be reused easily.
Productization: Lastly, Apex is more focused on productizing big data applications, so it has many features that help with easy development and maintenance of applications.
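
To make the buffer server and recovery points a bit more concrete, here is a minimal sketch of the idea in Python. This is not Apex code (Apex is written in Java), and the class and method names (`ToyBufferServer`, `publish`, `fetch`, `commit`) are hypothetical names I made up for illustration. It only assumes what is described above: data sits between operators, subscribers read from an offset, and data is kept until no subscriber needs it.

```python
class ToyBufferServer:
    """Toy in-memory buffer between an upstream and downstream operators.

    Hypothetical illustration only, not the Apex implementation.
    """

    def __init__(self):
        self._base = 0        # offset of the oldest tuple still retained
        self._tuples = []     # retained tuples, oldest first
        self._committed = {}  # subscriber name -> first offset it still needs

    def publish(self, item):
        """Append a tuple and return the offset it was stored at."""
        self._tuples.append(item)
        return self._base + len(self._tuples) - 1

    def subscribe(self, name):
        """Register a subscriber; it needs everything currently retained."""
        self._committed[name] = self._base

    def fetch(self, offset, max_items=10):
        """Read up to max_items tuples starting at a particular offset."""
        if offset < self._base:
            raise ValueError("offset %d was already purged" % offset)
        lo = offset - self._base
        return self._tuples[lo:lo + max_items]

    def commit(self, name, offset):
        """Subscriber declares it no longer needs anything before offset."""
        self._committed[name] = max(self._committed[name], offset)
        self._purge()

    def _purge(self):
        # Drop only the prefix that every subscriber has committed past.
        keep_from = min(self._committed.values())
        del self._tuples[:keep_from - self._base]
        self._base = keep_from

    def retained(self):
        return len(self._tuples)


server = ToyBufferServer()
server.subscribe("fast")
server.subscribe("slow")
for i in range(5):
    server.publish("t%d" % i)

server.commit("fast", 5)     # "fast" is done with all five tuples
first = server.fetch(0, 2)   # "slow" can still read from offset 0
server.commit("slow", 2)     # now offsets 0 and 1 can be purged
replay = server.fetch(2)     # a restarted "slow" resumes from offset 2
```

In this toy version a restarted downstream subscriber simply fetches again from its last committed offset, so the upstream side does not have to replay anything from the original source. That is the intuition behind restarting only part of the topology; the real mechanism in Apex is tied to streaming windows and checkpoints and is more involved than this.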

Note: I am a committer to Apache Apex, so I might sound biased toward Apex :)

107

answered Oct 29 '22 19:10

priya


Tags:

apache-flink

stream-processing

apache-apex

Biplob Biswas
