I don't understand the "owner" field in Airflow. The docstring for owner says "the owner of the task, using the unix username is recommended". I wrote the following code:
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.bash_operator import BashOperator

default_args = {
    'owner': 'max',
    'depends_on_past': False,
    'start_date': datetime(2016, 7, 14),
    'email': ['[email protected]'],
    'email_on_failure': False,
    'email_on_retry': False,
    'retries': 1,
    'retry_delay': timedelta(minutes=5),
}
dag = DAG('dmp-annalect', default_args=default_args,
          schedule_interval='30 0 * * *')
pigjob_basedata = """{local_dir}/src/basedata/basedata.sh >
{local_dir}/log/basedata/run_log &
""".format(local_dir=WORKSPACE)
task1_pigjob_basedata = BashOperator(
    task_id='task1_pigjob_basedata',
    owner='max',
    bash_command=pigjob_basedata,
    dag=dag)
But when I ran the command "airflow test dagid taskid 2016-07-20", I got an error: ... {bash_operator.py:77} INFO - put: Permission denied: user=airflow, ....
I thought my job would run as the "max" user, but apparently the test ran as the "airflow" user.
I want to run my task as the "max" user. How should I do that?
I've mitigated this by adding the airflow user and all other users who own tasks into a group, then giving the entire group permission to read, write, and execute files within airflow's home. Not sure if this is best practice, but it works and makes the owner field more useful than setting airflow as the owner of every DAG.
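That group-based setup can be sketched as follows. The group name airflow_users is illustrative, AIRFLOW_HOME is assumed to be /home/airflow, and the user/group commands need root:

```shell
#!/bin/sh
# One-time setup, run as root: put the scheduler user and every task
# owner into a shared group, then open AIRFLOW_HOME to that group.
#
#   groupadd airflow_users
#   usermod -aG airflow_users airflow
#   usermod -aG airflow_users max
#   chgrp -R airflow_users /home/airflow
#   chmod -R g+rwX /home/airflow   # g+rwX: rwx on dirs, rw on plain files

# The resulting permission bits can be verified without root on a
# scratch directory:
demo=$(mktemp -d)
chmod 770 "$demo"        # owner and group get rwx, others get nothing
stat -c '%a' "$demo"     # prints: 770
rm -rf "$demo"
```

Note that the task still executes as the airflow user; the shared group just lets files under AIRFLOW_HOME be read and written regardless of which user owns them.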
I figured out this issue. Because I set AIRFLOW_HOME to /home/airflow/, only the airflow user can access that directory.