Workflow tool comparison: Oozie vs. Cascading

I am looking for a workflow tool to run complex MapReduce jobs. I have Oozie in mind but also want to explore Cascading. Is there any sample code or an example that chains existing M/R jobs using the Cascading API? Also, can you provide a comparison of Oozie vs. Cascading?

user1499636 asked Jul 03 '12 18:07


1 Answer

Cascading and Oozie are not in the same category.

Oozie is a workflow scheduler.

Cascading is an API for creating workflows. It is scheduler-agnostic, i.e., it should run with whatever scheduler you use.

The confusion perhaps arises because the Oozie docs mention a "DAG" and because both tools run atop Hadoop.

Also, Cascading has a notion of "data availability" in its checkpoint support; Oozie supports this notion as well, albeit differently.
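To make the "DAG" idea concrete, here is a minimal sketch in plain Java (no Hadoop or Cascading on the classpath) of how a workflow API can chain existing jobs: each job declares the paths it reads and writes, and the run order falls out of those data dependencies. The class `JobChainSketch`, the `Job` type, and the paths are all hypothetical names made up for this illustration; this is not Cascading's API. As far as I recall, Cascading itself wraps an existing `JobConf` in a `MapReduceFlow` and links flows with a `CascadeConnector`, but check the docs for your version.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class JobChainSketch {

    /** Hypothetical stand-in for one existing M/R job: a name plus the paths it reads and writes. */
    static final class Job {
        final String name;
        final Set<String> inputs;
        final Set<String> outputs;

        Job(String name, Set<String> inputs, Set<String> outputs) {
            this.name = name;
            this.inputs = inputs;
            this.outputs = outputs;
        }
    }

    /** Returns job names ordered so that each job comes after the jobs that produce its inputs. */
    static List<String> executionOrder(List<Job> jobs) {
        // Map every output path to the job that produces it.
        Map<String, Job> producerOf = new HashMap<>();
        for (Job job : jobs) {
            for (String out : job.outputs) {
                producerOf.put(out, job);
            }
        }
        List<String> order = new ArrayList<>();
        Set<Job> done = new HashSet<>();
        Set<Job> inProgress = new HashSet<>();
        for (Job job : jobs) {
            visit(job, producerOf, done, inProgress, order);
        }
        return order;
    }

    /** Depth-first walk: schedule upstream producers first, then the job itself. */
    private static void visit(Job job, Map<String, Job> producerOf,
                              Set<Job> done, Set<Job> inProgress, List<String> order) {
        if (done.contains(job)) {
            return;
        }
        if (!inProgress.add(job)) {
            // Seeing a job again before it finished means the graph is not a DAG.
            throw new IllegalStateException("cycle detected at job " + job.name);
        }
        for (String in : job.inputs) {
            Job upstream = producerOf.get(in);
            if (upstream != null) {
                visit(upstream, producerOf, done, inProgress, order);
            }
        }
        inProgress.remove(job);
        done.add(job);
        order.add(job.name);
    }

    public static void main(String[] args) {
        // Jobs are listed out of order on purpose; the data dependency fixes the order.
        List<Job> jobs = Arrays.asList(
            new Job("aggregate", Collections.singleton("/tmp/cleaned"), Collections.singleton("/out/report")),
            new Job("clean", Collections.singleton("/in/raw"), Collections.singleton("/tmp/cleaned")));
        System.out.println(executionOrder(jobs)); // prints [clean, aggregate]
    }
}
```

Note that nothing here decides when the chain runs. That part (time triggers, waiting for data to arrive, retries) is the scheduler's job, which is where something like Oozie comes in.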

Paco answered Nov 15 '22 09:11