Is it possible to limit visibility and accessibility of DAGs by user groups in Airflow?
For example, I want to have one large Airflow environment for my entire company, with different teams using it for their own workflows. Say team A and team B belong to their respective AD/LDAP groups, group A and group B. Is it possible to have group A see only the DAGs that belong to their team, and vice versa for group B?
Based on my research and understanding, I don't think this is possible in a single Airflow environment. I think I will need to create a separate Airflow environment for each team, so that each team has its own Airflow DAGs folder containing its respective DAGs.
This page describes Airflow UI Access Control (also called Airflow Role-Based Access Control, or Airflow RBAC) in Cloud Composer. This feature provides an additional mechanism to separate users in the Airflow UI and DAG UI of your environment.
Airflow looks in your DAGS_FOLDER for modules that contain DAG objects in their global namespace and adds the objects it finds to the DagBag.
I think there are two different problems posed here:
First, LDAP authentication. Airflow provides support for LDAP authentication built on ldap3. The example in the linked doc shows how to associate Airflow roles with LDAP groups (e.g., the data_profiler_filter part).
Second, restricting DAG access by group. As of this writing, the current version of Airflow (1.9) doesn't support limiting visibility of DAGs by group. The recent work on role-based access control (RBAC) changes this. I've listed three options for addressing this problem below.
The new RBAC features add support for permissions like this and are the best option for fine-grained control. The permission system is built on Flask App Builder. It was created by a company with a use case very similar to yours, which is discussed in more detail in the Jira issue.
More info can be found in:
The RBAC webserver UI is available on master now in airflow/www_rbac. Other features around RBAC are also being actively developed to further improve security in a multi-tenancy setup.
Note: There's also ongoing work on a new DAG-level access control (DLAC) feature in AIRFLOW-2267 which builds upon the RBAC work to introduce even more fine-grained control. More info can be found in the design doc and PR #3197.
A second option you can consider for medium-grained control is a multi-tenancy setup using webserver.filter_by_owner and setting one explicit owner (a user, not a group) for each DAG. "With this, a user will see only the dags which it is owner of, unless it is a superuser."
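To make the quoted rule concrete, here is a minimal sketch in plain Python. It is only a model of the behavior described above, not Airflow's actual implementation: `DagInfo`, `User`, and `visible_dags` are hypothetical names made up for illustration, and the example DAG ids and usernames are assumptions.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class DagInfo:
    """Hypothetical stand-in for a DAG: just an id and its single owner."""
    dag_id: str
    owner: str  # one explicit user per DAG, not a group


@dataclass(frozen=True)
class User:
    """Hypothetical stand-in for a logged-in webserver user."""
    username: str
    is_superuser: bool = False


def visible_dags(dags: List[DagInfo], user: User,
                 filter_by_owner: bool = True) -> List[DagInfo]:
    """Model of the quoted rule: a user sees only the DAGs they own,
    unless filtering is off or the user is a superuser."""
    if not filter_by_owner or user.is_superuser:
        return list(dags)
    return [dag for dag in dags if dag.owner == user.username]


# Assumed example data: one DAG per team, each owned by a single user.
DAGS = [
    DagInfo(dag_id="team_a_etl", owner="alice"),
    DagInfo(dag_id="team_b_reports", owner="bob"),
]

if __name__ == "__main__":
    for user in (User("alice"), User("bob"), User("admin", is_superuser=True)):
        print(user.username, [dag.dag_id for dag in visible_dags(DAGS, user)])
```

Because the match is on a single username rather than a group, every member of team A would need to log in as the owning user (or be a superuser) to see team A's DAGs. That is why this option is only medium-grained.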
Aside: A related feature you might be interested in is running tasks as a specific user with impersonation, using run_as_user or core.default_impersonation.
A third option for coarse-grained control that some companies choose is to run multiple separate Airflow instances, one per team. This is probably the most practical for those looking to run multiple teams' DAGs in isolation today. If you happen to use Astronomer Enterprise, we support spinning up multiple Airflow instances.