I found this topological sorting function for PHP:
Source: http://www.calcatraz.com/blog/php-topological-sort-function-384/
function topological_sort($nodeids, $edges) {
$L = $S = $nodes = array();
foreach($nodeids as $id) {
$nodes[$id] = array('in'=>array(), 'out'=>array());
foreach($edges as $e) {
if ($id==$e[0]) { $nodes[$id]['out'][]=$e[1]; }
if ($id==$e[1]) { $nodes[$id]['in'][]=$e[0]; }
}
}
foreach ($nodes as $id=>$n) { if (empty($n['in'])) $S[]=$id; }
while (!empty($S)) {
$L[] = $id = array_shift($S);
foreach($nodes[$id]['out'] as $m) {
$nodes[$m]['in'] = array_diff($nodes[$m]['in'], array($id));
if (empty($nodes[$m]['in'])) { $S[] = $m; }
}
$nodes[$id]['out'] = array();
}
foreach($nodes as $n) {
if (!empty($n['in']) or !empty($n['out'])) {
return null; // not sortable as graph is cyclic
}
}
return $L;
}
I looks nice and short. Anyways, for some input - I get duplicate lines in the output - see http://codepad.org/thpzCOyn
Generally the sorting seems to be correct if I remove the duplicates with array_unique()
I checked the function with two examples and the sorting itself looks correct.
Should I just call array_unique()
on the result?
In topological sorting, we need to print a vertex before its adjacent vertices. For example, in the given graph, the vertex '5' should be printed before vertex '0', but unlike DFS, the vertex '4' should also be printed before vertex '0'. So Topological sorting is different from DFS.
Precisely, a topological sort is a graph traversal in which each node v is visited only after all its dependencies are visited. A topological ordering is possible if and only if the graph has no directed cycles, that is, if it is a directed acyclic graph (DAG).
Topological Sorting can be done by both DFS as well as BFS,this post however is concerned with the BFS approach of topological sorting popularly know as Khan's Algorithm.
Topological sort simply involves running DFS on an entire graph and adding each node to the global ordering of nodes, but only after all of a node's children are visited. This ensures that parent nodes will be ordered before their child nodes, and honors the forward direction of edges in the ordering.
I'm the author of the original topological sorting function. Thanks to Alex for bringing the duplicate edge issue to my attention. I've updated the function to correctly remove duplicate edges and nodes. The updated version is here:
http://www.calcatraz.com/blog/php-topological-sort-function-384 (Same as original link)
I added the following to achieve the deduplication:
// remove duplicate nodes
$nodeids = array_unique($nodeids);
// remove duplicate edges
$hashes = array();
foreach($edges as $k=>$e) {
$hash = md5(serialize($e));
if (in_array($hash, $hashes)) { unset($edges[$k]); }
else { $hashes[] = $hash; };
}
I had to serialize the edges to make sure duplicates were removed correctly. I also tidied the rest of the function up a bit and added some comments.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With