Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Relationship between BFS and topological sort

Topological sort can be done using both a DFS(having edges reversed) and also using a queue . A BFS can also be done using a queue . Is there any relationship between the way elements are stored and retrieved while using queue for a BFS to that when used a queue for topological sorting . Clarification will be helpful . Thanks.

like image 364
motiur Avatar asked Sep 11 '12 15:09

motiur


1 Answers

No, there is not necessarily any relationship. I assume you are referring to the algorithm by Kahn from wikipedia/Topological_sorting#Algorithms, which wikipedia notes:

Note that, reflecting the non-uniqueness of the resulting sort, the structure S can be simply a set or a queue or a stack.

Thus the "queue" for topological sorting is really "any collection" structure and the ordering in this collection does not matter; it can be anything. The queue used for BFS on the other hand, is all about the order; so that it can accomplish its FIFO (first-in, first-out) task. Changing this ordering will ruin the BFS algorithm.

There might be other "queue" based algorithms for topological sort, where it does matter that the structure is a queue. If you are asking about a particular such algorithm, please clarify.

EDIT: Algorithm of interest is clarified to be Improved algorithm section, which is the same as Kahn's.

EDIT: I've written some code that implements topological sort according to the Improved algorithm section in the page you linked. I made the type of collection it uses arbitrary as an argument of the sort function. I then make a few types of such collections, including a stack, a queue, a random-pop-collection and a python set (its a hashset, so no guarantees on order).

I then make a graph, and test the sorting algorithm on it with each collection. Then I test each of the results using the definition listed on wikipedia of topological sort:

.. a topological sort (sometimes abbreviated topsort or toposort) or topological ordering of a directed graph is a linear ordering of its vertices such that, for every edge uv, u comes before v in the ordering.

― wikipedia

The code is written in python and follows. The result is here from http://ideone.com. I don't know a good easy way to generate random DAGs for testing so my test graph is lame. Feel free to comment/edit a good DAG generator.

EDIT: Now I have a less lame generator, but it uses networkx. The function nx_generate_random_dag is in the code, but it imports networkx in the function. You can uncomment the marked section in main to generate graphs. I hardcoded a generated graph into the code, so we get more interesting results.

All of this is to show, that the ordering of the "collection" data structure (the queue in the algorithm) can be in any order.

from collections import deque
import random


def is_topsorted(V,E,sequence):
  sequence = list(sequence)
  #from wikipedia definition of top-sort
  #for every edge uv, u comes before v in the ordering
  for u,v in E:
    ui = sequence.index(u)
    vi = sequence.index(v)
    if not (ui < vi):
      return False
  return True 

#the collection_type should behave like a set:
# it must have add(), pop() and __len__() as members.
def topsort(V,E,collection_type):
  #out edges
  INS = {}

  #in edges
  OUTS = {}
  for v in V:
    INS[v] = set()
    OUTS[v] = set()

  #for each edge u,v,
  for u,v in E:
    #record the out-edge from u
    OUTS[u].add(v)
    #record the in-edge to v
    INS[v].add(u)

  #1. Store all vertices with indegree 0 in a queue
  #We will start
  topvertices = collection_type()

  for v,in_vertices in INS.iteritems():
    if len(in_vertices) == 0:
      topvertices.add(v)

  result = []

  #4. Perform steps 2 and 3 while the queue is not empty.
  while len(topvertices) != 0:  
    #2. get a vertex U and place it in the sorted sequence (array or another queue).
    u = topvertices.pop()
    result.append(u)

    #3. For all edges (U,V) update the indegree of V,
    # and put V in the queue if the updated indegree is 0.

    for v in OUTS[u]:
      INS[v].remove(u)
      if len(INS[v]) == 0:
        topvertices.add(v)

  return result

class stack_collection:
  def __init__(self):
    self.data = list()
  def add(self,v):
    self.data.append(v)
  def pop(self):
    return self.data.pop()
  def __len__(self):
    return len(self.data)

class queue_collection:
  def __init__(self):
    self.data = deque()
  def add(self,v):
    self.data.append(v)
  def pop(self):
    return self.data.popleft()
  def __len__(self):
    return len(self.data)

class random_orderd_collection:
  def __init__(self):
    self.data = []
  def add(self,v):
    self.data.append(v)
  def pop(self):    
    result = random.choice(self.data)
    self.data.remove(result)
    return result
  def __len__(self):
    return len(self.data)

"""
Poor man's graph generator.
Requires networkx.

Don't make the edge_count too high compared with the vertex count,
 otherwise it will run for a long time or forever.
"""
def nx_generate_random_dag(vertex_count,edge_count):
  import networkx as nx

  V = range(1,vertex_count+1)
  random.shuffle(V)

  G = nx.DiGraph()
  G.add_nodes_from(V)

  while nx.number_of_edges(G) < edge_count:

    u = random.choice(V)
    v = random.choice(V)
    if u == v:
      continue

    for tries in range(2):
      G.add_edge(u,v)
      if not nx.is_directed_acyclic_graph(G):
        G.remove_edge(u,v)
        u,v = v,u
  V = G.nodes()
  E = G.edges()

  assert len(E) == edge_count
  assert len(V) == vertex_count
  return V,E




def main():

  graphs = []

  V = [1,2,3,4,5]
  E = [(1,2),(1,5),(1,4),(2,4),(2,5),(3,4),(3,5)]

  graphs.append((V,E))

  """
  Uncomment this section if you have networkx.
  This will generate 3 random graphs.
  """
  """
  for i in range(3):
    G = nx_generate_random_dag(30,120)
    V,E = G
    print 'random E:',E
    graphs.append(G)
  """


  #This graph was generated using nx_generate_random_dag() from above
  V = range(1,31)
  E = [(1, 10), (1, 11), (1, 14), (1, 17), (1, 18), (1, 21), (1, 23),
       (1, 30), (2, 4), (2, 12), (2, 15), (2, 17), (2, 18), (2, 19),
       (2, 25), (3, 22), (4, 5), (4, 8), (4, 22), (4, 23), (4, 26),
       (5, 27), (5, 23), (6, 24), (6, 28), (6, 27), (6, 20), (6, 29),
       (7, 3), (7, 19), (7, 13), (8, 24), (8, 10), (8, 3), (8, 12),
       (9, 4), (9, 8), (9, 10), (9, 14), (9, 19), (9, 27), (9, 28),
       (9, 29), (10, 18), (10, 5), (10, 23), (11, 27), (11, 5),
       (12, 10), (13, 9), (13, 26), (13, 3), (13, 12), (13, 6), (14, 24),
       (14, 28), (14, 18), (14, 20), (15, 3), (15, 12), (15, 17), (15, 19),
       (15, 25), (15, 27), (16, 4), (16, 5), (16, 8), (16, 18), (16, 20), (16, 23),
       (16, 26), (16, 28), (17, 4), (17, 5), (17, 8), (17, 12), (17, 22), (17, 28),
       (18, 11), (18, 3), (19, 10), (19, 18), (19, 5), (19, 22), (20, 5), (20, 29),
       (21, 25), (21, 12), (21, 30), (21, 17), (22, 11), (24, 3), (24, 10),
       (24, 11), (24, 28), (25, 10), (25, 17), (25, 23), (25, 27), (26, 3),
       (26, 18), (26, 19), (28, 26), (28, 11), (28, 23), (29, 2), (29, 4),
       (29, 11), (29, 15), (29, 17), (29, 22), (29, 23), (30, 3), (30, 7),
       (30, 17), (30, 20), (30, 25), (30, 26), (30, 28), (30, 29)]

  graphs.append((V,E))

  #add other graphs here for testing


  for G in graphs:
    V,E = G

    #sets in python are unordered but in practice their hashes usually order integers.
    top_set = topsort(V,E,set)

    top_stack = topsort(V,E,stack_collection)

    top_queue = topsort(V,E,queue_collection)

    random_results = []
    for i in range(0,10):
      random_results.append(topsort(V,E,random_orderd_collection))

    print
    print 'V: ', V
    print 'E: ', E
    print 'top_set ({0}): {1}'.format(is_topsorted(V,E,top_set),top_set)
    print 'top_stack ({0}): {1}'.format(is_topsorted(V,E,top_stack),top_stack)
    print 'top_queue ({0}): {1}'.format(is_topsorted(V,E,top_queue),top_queue)

    for random_result in random_results:
      print 'random_result ({0}): {1}'.format(is_topsorted(V,E,random_result),random_result)
      assert is_topsorted(V,E,random_result)

    assert is_topsorted(V,E,top_set)
    assert is_topsorted(V,E,top_stack)
    assert is_topsorted(V,E,top_queue)



main()
like image 120
Realz Slaw Avatar answered Oct 04 '22 19:10

Realz Slaw