I use fork/join in Oozie to run some sub-workflow actions in parallel. My workflow.xml looks like this:
<workflow-app name="myName" xmlns="uri:oozie:workflow:0.5">
    <start to="fork1"/>
    <kill name="Kill">
        <message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <fork name="fork1">
        <path start="subworkflow1"/>
        <path start="subworkflow2"/>
    </fork>
    <join name="Completed" to="End"/>
    <action name="subworkflow1">
        <sub-workflow>
            <app-path>....</app-path>
            <propagate-configuration/>
            <configuration>
                <property>
                    <name>....</name>
                    <value>....</value>
                </property>
            </configuration>
        </sub-workflow>
        <ok to="Completed"/>
        <error to="Completed"/>
    </action>
    <action name="subworkflow2">
        <sub-workflow>
            <app-path>....</app-path>
            <propagate-configuration/>
            <configuration>
                <property>
                    <name>....</name>
                    <value>....</value>
                </property>
            </configuration>
        </sub-workflow>
        <ok to="Completed"/>
        <error to="Completed"/>
    </action>
    <end name="End"/>
</workflow-app>
When subworkflow1 is killed (because it failed for some reason), it kills subworkflow2 as well. I want those two actions to run in parallel, but not to depend on each other.
In my workflow, when subworkflow1 is killed, I see that subworkflow2 is also killed, yet my app is reported as succeeded (I checked this in HUE, under Oozie dashboard -> Workflows).
In this case I want subworkflow1 to be killed and subworkflow2 to succeed, and I don't really care what status the app as a whole reports.
What should I do so that each path gets its own status and keeps running even when another path in the same fork is killed?
A fork node splits one path of execution into multiple concurrent paths of execution. A join node waits until every concurrent execution path of a previous fork node arrives at it. The fork and join nodes must be used in pairs.
Every fork needs a matching join, because a join assumes that all the paths arriving at it are children of a single fork. Fork and join are also used to run multiple independent jobs concurrently, for better utilization of the cluster.
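As a rough illustration of that pairing, here is a minimal sketch in which both forked paths lead to the same join. The node names (forkJobs, jobA, jobB, joinJobs) and the ${dirA} and ${dirB} parameters are made up for this example, and the fs actions are only stand-ins for real work:

```xml
<workflow-app name="fork-join-sketch" xmlns="uri:oozie:workflow:0.5">
    <start to="forkJobs"/>
    <!-- The fork starts both paths concurrently -->
    <fork name="forkJobs">
        <path start="jobA"/>
        <path start="jobB"/>
    </fork>
    <!-- Placeholder action: ${dirA} is a hypothetical job property -->
    <action name="jobA">
        <fs>
            <mkdir path="${dirA}"/>
        </fs>
        <ok to="joinJobs"/>
        <error to="fail"/>
    </action>
    <!-- Placeholder action: ${dirB} is a hypothetical job property -->
    <action name="jobB">
        <fs>
            <mkdir path="${dirB}"/>
        </fs>
        <ok to="joinJobs"/>
        <error to="fail"/>
    </action>
    <!-- The join waits for every path started by forkJobs -->
    <join name="joinJobs" to="end"/>
    <kill name="fail">
        <message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
```

In this sketch the error transitions go straight to the kill node, so a failure in one path ends the whole workflow job.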
I recently ran into this issue as well, and found a way to get Oozie to behave the way I want.
Your forked actions can have an error "to" value equal to your join name. This will skip any subsequent action in that particular forked execution path. Then, your join's "to" value can send control to a decision node. That decision node should check the value of wf:lastErrorNode(). If the value is an empty string, continue processing the workflow as needed. If the value is not an empty string, then an error occurred and you can send control to a kill node.
Here's an example:
<start to="forkMe"/>
<fork name="forkMe">
    <path start="action1"/>
    <path start="action2"/>
</fork>
<action name="action1">
    ...
    <ok to="joinMe"/>
    <error to="joinMe"/>
</action>
<action name="action2">
    ...
    <ok to="joinMe"/>
    <error to="joinMe"/>
</action>
<join name="joinMe" to="decisionMe"/>
<decision name="decisionMe">
    <switch>
        <case to="end">
            ${wf:lastErrorNode() eq ""}
        </case>
        <default to="error-mail"/>
    </switch>
</decision>
<action name="error-mail">
    ...
    <ok to="fail"/>
    <error to="fail"/>
</action>
<kill name="fail">
    <message>Job failed:
        message[${wf:errorMessage(wf:lastErrorNode())}]
    </message>
</kill>
<end name="end"/>