I am trying to use the S3ListOperator from airflow.providers.amazon.aws.operators.s3_list to list the files in an S3 bucket in my AWS account, with the operator below:
list_bucket = S3ListOperator(
    task_id='list_files_in_bucket',
    bucket='<MY_BUCKET>',
    aws_conn_id='s3_default'
)
I have configured the connection's Extra field as: {"aws_access_key_id": "<MY_ACCESS_KEY>", "aws_secret_access_key": "<MY_SECRET_KEY>"}
When I run my Airflow DAG, it appears to execute fine and the task status is Success. Here is the log output:
[2021-04-27 11:44:50,009] {base_aws.py:368} INFO - Airflow Connection: aws_conn_id=s3_default
[2021-04-27 11:44:50,013] {base_aws.py:170} INFO - Credentials retrieved from extra_config
[2021-04-27 11:44:50,013] {base_aws.py:84} INFO - Creating session with aws_access_key_id=<MY_ACCESS_KEY> region_name=None
[2021-04-27 11:44:50,027] {base_aws.py:157} INFO - role_arn is None
[2021-04-27 11:44:50,661] {taskinstance.py:1185} INFO - Marking task as SUCCESS. dag_id=two_step, task_id=list_files_in_bucket, execution_date=20210427T184422, start_date=20210427T184439, end_date=20210427T184450
[2021-04-27 11:44:50,676] {taskinstance.py:1246} INFO - 0 downstream tasks scheduled from follow-on schedule check
[2021-04-27 11:44:50,700] {local_task_job.py:146} INFO - Task exited with return code 0
Is there anything I can do to print the files in my bucket to the logs? TIA
This code is enough; you don't need to use a print function. The operator returns the list of keys, and Airflow pushes that return value to XCom. Open the task instance in the UI, check the corresponding log, then go to its XCom tab, and the returned list is there.
list_bucket = S3ListOperator(
task_id='list_files_in_bucket',
bucket='ob-air-pre',
prefix='data/',
delimiter='/',
aws_conn_id='aws'
)
