
Oozie shell action: exec and file tags

Tags:

oozie

I'm new to Oozie. I've read several Oozie shell action examples, but they left me confused about a few things.

Some examples I've seen have no <file> tag at all.

Other examples, like this one from Cloudera, repeat the shell script name in the <file> tag:

<shell xmlns="uri:oozie:shell-action:0.2">
    <exec>check-hour.sh</exec>
    <argument>${earthquakeMinThreshold}</argument>
    <file>check-hour.sh</file>
</shell>

The example on Oozie's website writes the script name twice, separated by #. It uses the reference ${EXEC}, which is defined in job.properties and points to the script.sh file:

<shell xmlns="uri:oozie:shell-action:0.1">
    ...
    <exec>${EXEC}</exec>
    <argument>A</argument>
    <argument>B</argument>
    <file>${EXEC}#${EXEC}</file>
</shell>

I've also seen examples where a path (HDFS or local?) is prepended to script.sh#script.sh within the <file> tag:

<shell xmlns="uri:oozie:shell-action:0.1">
    ...
    <exec>script.sh</exec>
    <argument>A</argument>
    <argument>B</argument>
    <file>/path/script.sh#script.sh</file>
</shell>

As I understand it, any shell script file can be placed in the workflow's HDFS path (the same path where workflow.xml resides).

Can someone explain the differences between these examples, and how <exec>, <file>, script.sh#script.sh, and /path/script.sh#script.sh are used?

oikonomiyaki asked Jan 27 '16 07:01



1 Answer

<file>hdfs:///apps/duh/mystuff/check-hour.sh</file> means "download that HDFS file into the Current Working Dir of the YARN container that runs the Oozie Launcher for the Shell action, using the same file name by default, so that I can reference it as ./check-hour.sh or simply check-hour.sh in the <exec> element".

<file>check-hour.sh</file> means "download that HDFS file -- a relative path is resolved against the workflow application directory, i.e. the HDFS directory that holds workflow.xml -- into etc. etc.".

<file>hdfs:///apps/duh/mystuff/check-hour.sh#youpi</file> means "download that HDFS file etc. etc., renaming it as youpi, so that I can reference it as ./youpi or simply youpi in the <exec> element".
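
To make the renaming case concrete, here is a minimal sketch of a complete workflow.xml built around that last form. The workflow name, the action and node names, the schema versions, and the ${jobTracker} and ${nameNode} properties are assumptions for illustration (they would normally come from job.properties); only the <exec> and <file> lines follow directly from the explanation above.

```xml
<workflow-app xmlns="uri:oozie:workflow:0.4" name="shell-file-demo">
    <start to="check-hour"/>
    <action name="check-hour">
        <shell xmlns="uri:oozie:shell-action:0.2">
            <!-- Assumed properties, expected to be defined in job.properties -->
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <!-- exec uses the local name, i.e. the part after '#' in <file> -->
            <exec>youpi</exec>
            <argument>A</argument>
            <!-- HDFS source path, then '#', then the name it gets in the working dir -->
            <file>hdfs:///apps/duh/mystuff/check-hour.sh#youpi</file>
        </shell>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Shell action failed</message>
    </kill>
    <end name="end"/>
</workflow-app>
```

The point of the sketch is that <exec> and the part after # have to agree: if the # part is dropped, <exec> would have to say check-hour.sh instead of youpi.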

Note that the Hue UI often inserts unnecessary # stuff with no actual name change. That's why you will see it so often.
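
Applied to the question's script.sh#script.sh example, that would make the following two fragments equivalent. This is only a sketch: the other elements of the action are left out, and the schema version is an assumption.

```xml
<!-- Fragment only: the rest of the action's elements are omitted. -->

<!-- Explicit local name that equals the file name (the redundant form) -->
<shell xmlns="uri:oozie:shell-action:0.2">
    <exec>script.sh</exec>
    <file>script.sh#script.sh</file>
</shell>

<!-- Same effect: with no '#' part, the local copy keeps the file name -->
<shell xmlns="uri:oozie:shell-action:0.2">
    <exec>script.sh</exec>
    <file>script.sh</file>
</shell>
```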

Samson Scharfrichter answered Oct 04 '22 02:10
