 

How should I set the right owner for a task in Airflow?

Tags:

airflow

owner

I don't understand the "owner" field in Airflow. The docstring for owner says "the owner of the task, using the unix username is recommended". I wrote the following code.

from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.bash_operator import BashOperator

default_args = {
    'owner': 'max',
    'depends_on_past': False,
    'start_date': datetime(2016, 7, 14),
    'email': ['[email protected]'],
    'email_on_failure': False,
    'email_on_retry': False,
    'retries': 1,
    'retry_delay': timedelta(minutes=5),
}

dag = DAG('dmp-annalect', default_args=default_args,
          schedule_interval='30 0 * * *')

pigjob_basedata = """{local_dir}/src/basedata/basedata.sh >
{local_dir}/log/basedata/run_log &
""".format(local_dir=WORKSPACE)

task1_pigjob_basedata = BashOperator(
    task_id='task1_pigjob_basedata',
    owner='max',
    bash_command=pigjob_basedata,
    dag=dag)

But when I ran the command "airflow test dagid taskid 2016-07-20", I got an error: ... {bash_operator.py:77} INFO - put: Permission denied: user=airflow, ....

I thought my job would run as the "max" user, but apparently the test ran as the "airflow" user.

How can I make my task run as the "max" user?

Max.H asked Jul 22 '16 07:07



2 Answers

I've mitigated this by adding the airflow user and all other users who own tasks to a group, then giving the entire group permission to read, write and execute files within airflow's home. I'm not sure this is best practice, but it works, and it makes the owner field more useful than setting airflow as the owner of every DAG.
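A minimal sketch of that setup, assuming airflow's home is /home/airflow and the task owners are airflow and max (the group name and paths here are illustrative, not part of the original answer):

```shell
# Create a shared group and add the airflow user plus each task owner to it
sudo groupadd airflow-users
sudo usermod -aG airflow-users airflow
sudo usermod -aG airflow-users max

# Give the group read/write on everything under airflow's home;
# capital X adds execute only on directories and files already executable
sudo chgrp -R airflow-users /home/airflow
sudo chmod -R g+rwX /home/airflow

# Set the setgid bit on directories so newly created files
# inherit the airflow-users group automatically
sudo find /home/airflow -type d -exec chmod g+s {} +
```

Users need to log out and back in before new group membership takes effect.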

kadu answered Oct 18 '22 02:10



I figured out this issue. Because I had set AIRFLOW_HOME to /home/airflow/, only the airflow user could access that directory.
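For reference, this is the kind of check and fix involved (the exact permissions shown are illustrative; whether you widen the home directory or move AIRFLOW_HOME elsewhere depends on your setup):

```shell
# Home directories are commonly created 0700, so other users
# cannot even traverse into them; check with:
ls -ld /home/airflow

# If the mode is drwx------, one option is to let other users
# traverse and read, so tasks owned by them can reach the DAG
# and log paths underneath:
sudo chmod o+rx /home/airflow
```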

Max.H answered Oct 18 '22 01:10
