I am implementing a DAG in Python using a dictionary. Each key represents a node in the graph, and the value associated with a key is the set of nodes that depend on the node at that key.
Is it necessary to use an OrderedDict instead of a dict to implement the DAG? An OrderedDict preserves the insertion order of its keys. Why would one want to preserve the insertion order of the nodes when the value at each key is already the set of nodes dependent on the node at that key?
To create a DAG in Airflow, you always have to import the DAG class. After the DAG class come the imports of the Operators: for each Operator you want to use, you have to make the corresponding import. For example, to execute a Python function, you have to import the PythonOperator.
Airflow is a general platform for programming workflows, including their creation, scheduling, and monitoring. Airflow implements workflows as DAGs, or Directed Acyclic Graphs. It can be accessed and controlled via code, via the command line, or via a built-in web interface.
Suppose you have the following DAG:
You could represent this DAG as a dictionary:
graph = {
    'root': ['a'],
    'a': ['b', 'e'],
    'b': ['c', 'd'],
    'd': ['e'],
}
You could also represent this DAG as an ordered dictionary, but that'd be unnecessary. The ordering of the key / value pairs does not matter. There's a buggy / incomplete Python DAG library that uses ordered dictionaries, but that lib isn't a good example to follow.
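To illustrate why key order is irrelevant, here is a minimal sketch of a topological sort (Kahn's algorithm) over the node -> dependents dictionary. The function name topological_order and the reordered copy of the graph are made up for this example; ties are broken alphabetically so that the result does not depend on insertion order.

```python
from collections import deque


def topological_order(graph):
    """Return one topological order of a node -> dependents mapping."""
    # Collect every node, including ones that appear only as dependents.
    nodes = set(graph)
    for dependents in graph.values():
        nodes.update(dependents)

    # Count the incoming edges of each node.
    indegree = {node: 0 for node in nodes}
    for dependents in graph.values():
        for node in dependents:
            indegree[node] += 1

    # Sorting the ready nodes makes the result independent of key order.
    ready = deque(sorted(n for n in nodes if indegree[n] == 0))
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for dependent in sorted(graph.get(node, ())):
            indegree[dependent] -= 1
            if indegree[dependent] == 0:
                ready.append(dependent)

    # Nodes left unprocessed mean the graph is not acyclic.
    if len(order) != len(nodes):
        raise ValueError("graph contains a cycle")
    return order


graph = {'root': ['a'], 'a': ['b', 'e'], 'b': ['c', 'd'], 'd': ['e']}
# Same edges, keys inserted in the opposite order (hypothetical variant).
reordered = {'d': ['e'], 'b': ['c', 'd'], 'a': ['b', 'e'], 'root': ['a']}

print(topological_order(graph))
print(topological_order(graph) == topological_order(reordered))
```

Both dictionaries describe the same set of edges, so any algorithm that only follows edges sees the same graph regardless of how the keys were inserted.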
networkx is the gold standard for Python DAGs (and other graphs). You can create a networkx directed graph with a list of tuples that represent the graph edges:
import networkx as nx
graph = nx.DiGraph()
graph.add_edges_from([("root", "a"), ("a", "b"), ("a", "e"), ("b", "c"), ("b", "d"), ("d", "e")])
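The edge-list form and the dictionary form carry the same information. As a rough sketch using only the standard library (the helper name edges_to_adjacency is an assumption for this example, not a networkx function), the edge list above can be folded back into the node -> dependents dictionary:

```python
from collections import defaultdict

edges = [("root", "a"), ("a", "b"), ("a", "e"),
         ("b", "c"), ("b", "d"), ("d", "e")]


def edges_to_adjacency(edges):
    """Group (source, target) pairs into a source -> [targets] mapping."""
    adjacency = defaultdict(list)
    for source, target in edges:
        adjacency[source].append(target)
    # Return a plain dict so lookups of missing keys do not create entries.
    return dict(adjacency)


adjacency = edges_to_adjacency(edges)
print(adjacency)
```

Nodes with no outgoing edges (here c and e) appear only as values, just as in the hand-written dictionary.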
See here for more information about Python DAGs.
graphlib is the module in the Python standard library for working with directed acyclic graphs. It was added in version 3.9.
It seems a bit redundant to copy/paste an example from the documentation, but here's a very short one:
>>> from graphlib import TopologicalSorter
>>> graph = {"D": {"B", "C"}, "C": {"A"}, "B": {"A"}}
>>> ts = TopologicalSorter(graph)
>>> tuple(ts.static_order())
('A', 'C', 'B', 'D')
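Note that TopologicalSorter expects each key to map to its predecessors, whereas the dictionary in the question maps each node to its dependents. A minimal sketch of bridging the two, assuming a hypothetical helper named invert, could look like this. A DAG can have several valid orders, so no particular output is shown.

```python
from graphlib import CycleError, TopologicalSorter

# node -> dependents, as in the question
dependents = {'root': ['a'], 'a': ['b', 'e'], 'b': ['c', 'd'], 'd': ['e']}


def invert(dependents):
    """Turn a node -> dependents mapping into node -> predecessors."""
    predecessors = {}
    for node, children in dependents.items():
        predecessors.setdefault(node, set())
        for child in children:
            predecessors.setdefault(child, set()).add(node)
    return predecessors


predecessors = invert(dependents)
order = list(TopologicalSorter(predecessors).static_order())
print(order)  # one valid order; every node comes after its predecessors

# A cyclic input is rejected with CycleError when the order is computed.
try:
    list(TopologicalSorter({"x": {"y"}, "y": {"x"}}).static_order())
except CycleError as error:
    print("cycle detected:", error.args[1])
```

Passing the dependents dictionary directly would still yield a valid order, but of the reversed graph, which is an easy mistake to overlook.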
For earlier versions of Python there is a backport: pip install graphlib_backport
or put this in your requirements.txt file:
graphlib_backport; python_version < "3.9.0"