Run a single application master for an Oozie workflow

(Related to "Why does the Oozie launcher consume 2 YARN containers?")

I have a cluster with 1900 cores and 11 TB of RAM. My Oozie workflow has the following structure:

  • Approximately 300-400 subworkflows with the same structure, run in parallel (via a fork control node)
  • Each subworkflow runs several tasks one after another (java actions, spark tasks, shell actions)
  • Some subworkflows finish in 3-5 minutes, others take 2-3 hours (long-running spark tasks)

The question is: is it possible to run these subworkflows in a single container (application master)? By default, Oozie/YARN uses two cores for each subworkflow: one for the AM and one for the map-reduce task (the controller). This is the bottleneck: about 1/3 of all cores in my cluster (roughly 600-800 of 1900 when all subworkflows are running) are used only for controlling, not for computing.

asked Oct 17 '22 17:10 by Evgeny Lazarev


1 Answer

I guess you can use Oozie's uber mode to save the container that launches the Oozie action job. In uber mode the AM runs the launcher itself instead of starting it in a separate container.

Add the following property to oozie-site.xml:

<property>
  <name>oozie.action.launcher.mapreduce.job.ubertask.enable</name>
  <value>true</value>
</property>
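
If changing oozie-site.xml for the whole server is not an option, the same switch can, as far as I know, also be set per action through the oozie.launcher. prefix in the action's configuration block in workflow.xml. A minimal sketch, assuming a java action; the action name, main class, and transition targets are placeholders and not taken from the question:

```xml
<!-- Hypothetical action: "example-java-action", com.example.Main,
     and the "end"/"fail" targets are illustrative placeholders. -->
<action name="example-java-action">
  <java>
    <job-tracker>${jobTracker}</job-tracker>
    <name-node>${nameNode}</name-node>
    <configuration>
      <!-- Ask YARN to run this action's launcher inside the AM (uber mode) -->
      <property>
        <name>oozie.launcher.mapreduce.job.ubertask.enable</name>
        <value>true</value>
      </property>
    </configuration>
    <main-class>com.example.Main</main-class>
  </java>
  <ok to="end"/>
  <error to="fail"/>
</action>
```

Either way, this only removes the separate launcher container; each action still needs its own AM, so it should roughly halve the control overhead rather than eliminate it.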
answered Oct 21 '22 09:10 by YoungHobbit
