What is the difference between an Oozie workflow, coordinator, and bundle?
An Oozie workflow defines a sequence of actions, and we need to invoke it manually every time we want it to run, whereas the same workflow can be scheduled through a coordinator. Is this understanding correct?
Then what does a bundle add on top of that?
I guess it is used, in turn, to schedule a set of coordinators. If so, why can't one coordinator be used to schedule another coordinator, the way one workflow can contain a sub-workflow?
The Oozie Bundle system allows the user to define and execute a bunch of coordinator applications often called a data pipeline. There is no explicit dependency among the coordinator applications in a bundle.
The Oozie Coordinator system allows the user to define and execute recurrent and interdependent workflow jobs (data application pipelines). Real world data application pipelines have to account for reprocessing, late processing, catchup, partial processing, monitoring, notification and SLAs.
What are the different states of an Apache Oozie Workflow job? An Apache Oozie Workflow job can have the following states: PREP, RUNNING, SUSPENDED, SUCCEEDED, KILLED, and FAILED.
Java action, shell action, MapReduce action, Hive action, Pig action are some of the workflow actions which can be scheduled and executed by the Apache Oozie scheduler system. One can also specify the condition for a job to run.
Workflow:
It is a sequence of actions, arranged as a directed acyclic graph (DAG). It is written in XML, and the actions can be MapReduce, Hive, Pig, etc.
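As a rough illustration of what such a definition looks like, here is a minimal workflow sketch with two actions chained one after the other. The application name, action names, the `${nameNode}` parameter, and the paths are hypothetical placeholders, the schema version is an assumption (use whichever your Oozie server supports), and simple `fs` actions stand in for the MapReduce, Hive, or Pig actions a real job would use.

```xml
<!-- Hypothetical workflow: names, paths and schema version are assumptions. -->
<workflow-app name="demo-wf" xmlns="uri:oozie:workflow:0.5">
    <start to="clean-output"/>

    <!-- First action: remove any output left over from a previous run. -->
    <action name="clean-output">
        <fs>
            <delete path="${nameNode}/tmp/demo/output"/>
        </fs>
        <ok to="make-output"/>
        <error to="fail"/>
    </action>

    <!-- Second action: runs only if the first one succeeded. -->
    <action name="make-output">
        <fs>
            <mkdir path="${nameNode}/tmp/demo/output"/>
        </fs>
        <ok to="end"/>
        <error to="fail"/>
    </action>

    <!-- Any action error routes here and ends the job in the KILLED state. -->
    <kill name="fail">
        <message>Workflow failed: ${wf:errorMessage(wf:lastErrorNode())}</message>
    </kill>

    <end name="end"/>
</workflow-app>
```

The `ok` and `error` transitions are what turn individual actions into a graph: each action names the node to go to next on success and on failure.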
Coordinator:
It is a program that triggers actions (commonly workflow jobs) when a set of conditions is met. Conditions can be a time frequency or other external events, such as the availability of input data.
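A minimal time-triggered coordinator might look like the sketch below, which is only meant to show the shape of the definition. The coordinator name, dates, schema version, and the `${workflowAppPath}` parameter are hypothetical. Data-availability triggers are declared separately (through datasets and input events) and are left out here.

```xml
<!-- Hypothetical coordinator: name, dates and schema version are assumptions. -->
<coordinator-app name="demo-coord"
                 frequency="${coord:days(1)}"
                 start="2024-01-01T00:00Z"
                 end="2024-12-31T00:00Z"
                 timezone="UTC"
                 xmlns="uri:oozie:coordinator:0.4">
    <action>
        <!-- The workflow application to launch once per day. -->
        <workflow>
            <app-path>${workflowAppPath}</app-path>
        </workflow>
    </action>
</coordinator-app>
```

Note that the coordinator's action points at a workflow application path, which is how the same workflow that was run by hand can be put on a schedule.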
Bundle:
It is defined as a higher-level Oozie abstraction that batches a set of coordinator jobs. We can also specify the time at which the bundle job should start.
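For illustration, a bundle that groups two coordinators could be sketched as follows. The bundle and coordinator names, the kick-off time, the schema version, and the `${coordAPath}` and `${coordBPath}` parameters are all hypothetical.

```xml
<!-- Hypothetical bundle: names, time and schema version are assumptions. -->
<bundle-app name="demo-bundle" xmlns="uri:oozie:bundle:0.2">
    <controls>
        <!-- Optional: when the bundle should start its coordinators. -->
        <kick-off-time>2024-01-01T00:00Z</kick-off-time>
    </controls>

    <!-- Each entry points at a coordinator application. -->
    <coordinator name="coord-a">
        <app-path>${coordAPath}</app-path>
    </coordinator>
    <coordinator name="coord-b">
        <app-path>${coordBPath}</app-path>
    </coordinator>
</bundle-app>
```

The bundle only lists the coordinators so they can be started, suspended, or killed together; it does not itself express any ordering or dependency between them.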