I am running Apache Airflow locally on a virtual machine and I want to connect to AWS Glue to run jobs. I got the operator source code from this pull request: https://github.com/apache/incubator-airflow/pull/3504/files
What connections (in the Airflow UI) should I set up to run the AWS Glue jobs? Can you point me to some documentation? I haven't found anything helpful in the official docs.
For the DAG I use this simple code:
from datetime import datetime
from airflow import DAG
from airflow.operators.python_operator import PythonOperator
from airflow.operators.aws_glue_operator import AWSGlueJobOperator

def print_hello():
    return 'Hello hello!'

dag = DAG('hello_world', description='Simple glue DAG',
          schedule_interval='0 0 * * *',
          start_date=datetime(2018, 6, 28), catchup=False)

awsGlueOperator = AWSGlueJobOperator(
    task_id='glue_task',
    job_name='FIRST_JOB',
    # script_location must be an s3:// URI, not an https:// URL
    script_location='s3://path-to-script',
    # s3_bucket takes the bucket name, not a console URL
    s3_bucket='my-bucket',
    iam_role_name='AWSGlueServiceRole',
    dag=dag)
hello_operator = PythonOperator(task_id='hello_task', python_callable=print_hello, dag=dag)

awsGlueOperator >> hello_operator
Thank you in advance.
It looks like the AWSGlueJobOperator you are using relies on the AwsHook. Jumping into the source code for that hook shows that the AWS keys and related settings can go in the connection's Extra field as a JSON object.
So you can probably just create a connection of the Amazon Web Services type and fill in the appropriate values there.
For example, the Extra field might look something like this (a sketch with placeholder values, assuming the hook reads the credentials and region from Extra as described above):
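{
  "aws_access_key_id": "YOUR_ACCESS_KEY_ID",
  "aws_secret_access_key": "YOUR_SECRET_ACCESS_KEY",
  "region_name": "us-east-2"
}
If you keep the connection ID as aws_default (the hook's default), the operator should pick it up automatically; otherwise, pass your connection ID through the operator's aws_conn_id argument, if the version you are using exposes it.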