I am looking for real world applications where topological sorting is performed on large graph sizes.
Some fields where I image you could find such instances would be bioinformatics, dependency resolution, databases, hardware design, data warehousing... but I hope some of you may have encountered or heard of any specific algorithms/projects/applications/datasets that require topsort.
Even if the data/project may not be publicly accessible any hints (and estimates on the order of magnitude of potential graph sizes) might be helpful.
In topological sorting, we need to print a vertex before its adjacent vertices. For example, in the given graph, the vertex '5' should be printed before vertex '0', but unlike DFS, the vertex '4' should also be printed before vertex '0'. So Topological sorting is different from DFS.
Any DAG has at least one topological ordering, and algorithms are known for constructing a topological ordering of any DAG in linear time. Topological sorting has many applications especially in ranking problems such as feedback arc set. Topological sorting is possible even when the DAG has disconnected components.
So topological sorting can be achieved for only directed and acyclic graphs.
Scheduling jobs from given dependencies among Jobs. For example, if some job requires the dependency of some other job, then we can use topological sorting. Determining the order of compilation tasks to perform in makefiles, data serializations and resolving symbol dependencies in linkers.
Here are some examples I've seen so far for Topological Sorting:
While scheduling task graphs in a distributed system, it is usually needed to sort the tasks topologically and then assign them to resources. I am aware of task graphs containing more than 100,000 tasks to be sorted in a topological order. See this in this context.
Once upon a time I was working on a Document Management System. Each document on this system has some kind of precedence constraint to a set of other documents, e.g. its content type or field referencing. Then, the system should be able to generate an order of the documents with the preserved topological order. As I can remember, there were around 5,000,000 documents available two years ago !!!
In the field of social networking, there is famous query to know the largest friendship distance in the network. This problem needs to traverse the graph by a BFS approach, equal to the cost of a topological sorting. Consider the members of Facebook and find your answer.
If you need more real examples, do not hesitate to ask me. I have worked in lots of projects working on on large graphs.
P.S. for large DAG datasets, you may take a look at Stanford Large Network Dataset Collection and Graphics@ Illinois page.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With