More specifically, which use cases does Hazelcast Jet solve that Flink does not solve (or not equally well), and vice versa?


What are the differences between Hazelcast Jet and Apache Flink

1 Answer

NOTE: I belong to Hazelcast Jet's core engineering team.

I'd say the main advantage of Hazelcast Jet isn't in offering a brand-new computing model, but in bringing the same level of convenience that Hazelcast is known for to the realm of DAG-based distributed computing.

If you currently have a Java application running in a cluster, adding Jet will be a snap: add the Maven dependency and write one line of code to start a Jet instance on the local member. The instances will self-discover to form their own cluster, and you can now submit your job to it.

If you want a dedicated distributed computing cluster, you can download the distribution ZIP to the cluster machines. Jet has native support for the most popular cloud environments, allowing the nodes you start to self-discover. You can then connect to the cluster using a Jet client.

Needless to say, Jet makes it very convenient to use a Hazelcast IMap or IList as a data source. A Jet cluster can host Hazelcast structures directly; you then benefit from data locality and get the data with no network traffic. On the other hand, the choice of data source is completely unconstrained, and there is a public API dedicated to implementing fast, arbitrarily partitioned, custom data sources.

Jet addresses the concerns of infinite stream processing, such as aggregating over time-based windows, dealing with reordered events, and resilience to changes in the cluster topology (e.g., failure of individual Jet nodes), while maintaining the exactly-once processing guarantee.
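To make the windowing idea concrete, here is a minimal plain-Java sketch of a tumbling (fixed-size, non-overlapping) time window. It is not Jet code and says nothing about how Jet implements windowing; the class, the Event type, and the method name are all hypothetical. It only shows why assigning events to windows by their own timestamp, rather than by arrival order, makes the result insensitive to reordering.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class TumblingWindowSketch {
    /** Hypothetical event: an event-time timestamp in milliseconds plus a value. */
    static final class Event {
        final long timestampMs;
        final long value;

        Event(long timestampMs, long value) {
            this.timestampMs = timestampMs;
            this.value = value;
        }
    }

    /** Sums values per tumbling window; each key is the start time of a window. */
    static Map<Long, Long> sumPerWindow(List<Event> events, long windowSizeMs) {
        Map<Long, Long> sums = new TreeMap<>();
        for (Event e : events) {
            // The window is derived from the event's own timestamp,
            // so the order in which events arrive does not matter.
            long windowStart = Math.floorDiv(e.timestampMs, windowSizeMs) * windowSizeMs;
            sums.merge(windowStart, e.value, Long::sum);
        }
        return sums;
    }

    public static void main(String[] args) {
        // Events deliberately listed out of timestamp order.
        List<Event> events = List.of(
                new Event(1_200, 5), new Event(300, 1),
                new Event(1_900, 2), new Event(700, 4));
        System.out.println(sumPerWindow(events, 1_000)); // prints {0=5, 1000=7}
    }
}
```

A real streaming engine additionally has to decide when a window is complete on an unbounded input and how to restore its state after a node failure; this batch-style sketch leaves both out.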

Jet's main programming paradigm is the Pipeline API, which is quite similar to the java.util.stream API but adapted to the specifics of distributed computing (lambda serialization and other concerns).
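The lambda serialization concern can be illustrated with the JDK alone. In the sketch below, SerializableFunction is a hypothetical stand-in interface (not Jet's own type), and the round trip through a byte array merely simulates shipping a function to another cluster member. A lambda is serializable only if its target type extends Serializable, which a plain java.util.function.Function does not.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

class SerializableLambdaSketch {
    /** Hypothetical stand-in: a Function whose lambdas can be serialized. */
    interface SerializableFunction<T, R> extends Function<T, R>, Serializable {}

    /** Writes the function to bytes and reads it back, as if sending it over the network. */
    @SuppressWarnings("unchecked")
    static <T, R> SerializableFunction<T, R> roundTrip(SerializableFunction<T, R> fn) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(fn);
            }
            try (ObjectInputStream in =
                         new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (SerializableFunction<T, R>) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Function could not be serialized", e);
        }
    }

    /** Applies a deserialized copy of the mapping function in an ordinary stream. */
    static List<String> upperCaseAll(List<String> words) {
        SerializableFunction<String, String> toUpper = s -> s.toUpperCase();
        SerializableFunction<String, String> shipped = roundTrip(toUpper);
        return words.stream().map(shipped).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(upperCaseAll(List.of("jet", "flink"))); // prints [JET, FLINK]
    }
}
```

Anything the lambda captures must be serializable as well, which is one of the places where code written for java.util.stream needs adjusting before it can run on a cluster.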

The Pipeline API builds upon a lower-level DAG-based model that is also exposed as a public API.

186

answered Sep 18 '22 01:09

Marko Topolnik


Tags:

apache-flink

hazelcast-jet

Atle
