I have looked at some common tools like Heapy to measure how much memory is being utilized by each traversal technique but I don't know if they are giving me the right results. Here is some code to give the context.
The code simply measures the number of unique nodes in a graph. Two traversal techniques provided viz. count_bfs
and count_dfs
import sys
from guppy import hpy
class Graph:
def __init__(self, key):
self.key = key #unique id for a vertex
self.connections = []
self.visited = False
def count_bfs(start):
parents = [start]
children = []
count = 0
while parents:
for ind in parents:
if not ind.visited:
count += 1
ind.visited = True
for child in ind.connections:
children.append(child)
parents = children
children = []
return count
def count_dfs(start):
if not start.visited:
start.visited = True
else:
return 0
n = 1
for connection in start.connections:
n += count_dfs(connection)
return n
def construct(file, s=1):
"""Constructs a Graph using the adjacency matrix given in the file
:param file: path to the file with the matrix
:param s: starting node key. Defaults to 1
:return start vertex of the graph
"""
d = {}
f = open(file,'rU')
size = int(f.readline())
for x in xrange(1,size+1):
d[x] = Graph(x)
start = d[s]
for i in xrange(0,size):
l = map(lambda x: int(x), f.readline().split())
node = l[0]
for child in l[1:]:
d[node].connections.append(d[child])
return start
if __name__ == "__main__":
s = construct(sys.argv[1])
#h = hpy()
print(count_bfs(s))
#print h.heap()
s = construct(sys.argv[1])
#h = hpy()
print(count_dfs(s))
#print h.heap()
I want to know by what factor is the total memory utilization different in the two traversal techniques viz. count_dfs
and count_bfs
? One might have the intuition that dfs
may be expensive as a new stack is created for every function call. How can the total memory allocations in each traversal technique be measured?
Do the (commented) hpy
statements give the desired measure?
Sample file with connections:
4
1 2 3
2 1 3
3 4
4
This being a python question, it may be more important how much stack space is used than how much total memory. Cpython has a low limit of 1000 frames because it shares its call stack with the c call stack, which in turn is limited to the order of one megabyte in most places. For this reason you should almost* always prefer iterative solutions to recursive ones when the recursion depth is unbounded.
* other implementations of python may not have this restriction. The stackless variants of cpython and pypy have this exact property
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With