In my python program I need multiple copies of a tree. Initially, I use deepcopy from the copy module, which turns out to be very slow. Then I write my own code to copy a tree, the code traverses the tree being copied and create a new node at each node being visited. Then I call this subroutines multiple times to get multiple copies. This solution is much faster (~40 times faster) than deepcopy.
Solution 2: Then I think, traversing a tree needs time T, make n copies, the time required is nT; but if I create n new nodes for each node being copied, I only need to traverse the tree being copied once, although at each node, you copy multiple nodes. Will this be faster? The result turns out to be: not much.
Still the copy operation is the bottleneck of my program. Is there any faster way to do that? Thanks! Stats -- using custom copy_tree function;
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 10.406 10.406 <string>:1(<module>)
1 0.002 0.002 10.406 10.406 C:\Python27\sdk.py:1431(algorithm1)
26 0.005 0.000 4.602 0.177 C:\Python27\sdk.py:1310(engage)
1342 0.005 0.000 4.208 0.003 C:\Python27\lib\idlelib\rpc.py:594(__call__)
1342 0.007 0.000 4.203 0.003 C:\Python27\lib\idlelib\rpc.py:208(remotecall)
1342 0.017 0.000 3.992 0.003 C:\Python27\lib\idlelib\rpc.py:238(asyncreturn)
1342 0.005 0.000 3.972 0.003 C:\Python27\lib\idlelib\rpc.py:279(getresponse)
1342 0.033 0.000 3.961 0.003 C:\Python27\lib\idlelib\rpc.py:295(_getresponse)
411/26 0.202 0.000 3.930 0.151 C:\Python27\sdk.py:1227(NodeEngage)
1338 0.014 0.000 3.909 0.003 C:\Python27\lib\threading.py:235(wait)
5356 3.877 0.001 3.877 0.001 {method 'acquire' of 'thread.lock' objects}
27 0.001 0.000 3.798 0.141 C:\Python27\sdk.py:888(pick_best_group)
378 0.003 0.000 3.797 0.010 C:\Python27\sdk.py:862(group_info)
46947/378 0.155 0.000 3.786 0.010 C:\Python27\sdk.py:833(core_possibilities)
27490 0.114 0.000 3.547 0.000 C:\Python27\sdk.py:779(find_cores)
46569 1.046 0.000 3.424 0.000 C:\Python27\sdk.py:798(find_a_true_core)
280274 0.873 0.000 1.464 0.000 C:\Python27\sdk.py:213(next)
27 0.002 0.000 1.393 0.052 C:\Python27\sdk.py:1008(s)
28196 0.016 0.000 1.070 0.000 C:\Python27\sdk.py:1000(copy_tree)
.............................Compare with deepcopy approach
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 191.193 191.193 <string>:1(<module>)
1 0.002 0.002 191.193 191.193 C:\Python27\sdk.py:1431(algorithm1)
26 0.006 0.000 185.611 7.139 C:\Python27\sdk.py:1310(engage)
411/26 1.200 0.003 185.013 7.116 C:\Python27\sdk.py:1227(NodeEngage)
30033397/28196 56.608 0.000 177.885 0.006 C:\Python27\lib\copy.py:145(deepcopy)
3340177/28196 15.354 0.000 177.741 0.006 C:\Python27\lib\copy.py:283(_deepcopy_inst)
6680354/28196 23.276 0.000 177.261 0.006 C:\Python27\lib\copy.py:253(_deepcopy_dict)
3340177/150307 22.345 0.000 171.525 0.001 C:\Python27\lib\copy.py:234(_deepcopy_tuple)
13360708 23.793 0.000 23.793 0.000 {hasattr}
13614747 12.483 0.000 15.349 0.000 C:\Python27\lib\copy.py:267(_keep_alive)
1342 0.005 0.000 7.281 0.005 C:\Python27\lib\idlelib\rpc.py:594(__call__)
1342 0.008 0.000 7.276 0.005 C:\Python27\lib\idlelib\rpc.py:208(remotecall)
1342 0.019 0.000 7.039 0.005 C:\Python27\lib\idlelib\rpc.py:238(asyncreturn)
1342 0.005 0.000 7.018 0.005 C:\Python27\lib\idlelib\rpc.py:279(getresponse)
1342 0.035 0.000 7.006 0.005 C:\Python27\lib\idlelib\rpc.py:295(_getresponse)
43649486 6.971 0.000 6.971 0.000 {method 'get' of 'dict' objects}
1341 0.015 0.000 6.950 0.005 C:\Python27\lib\threading.py:235(wait)
5365 6.917 0.001 6.917 0.001 {method 'acquire' of 'thread.lock' objects}
6680354 5.325 0.000 5.325 0.000 {method 'iteritems' of 'dict' objects}
57037048 4.854 0.000 4.854 0.000 {id}
@ThomasH: this is the copy function, which is very simple and custom. See my comment to Ross for the content of tree nodes
def r_copy_tree(node_to_copy, dad_info):
new_node = node(dad_info)
for (a,son_to_copy) in node_to_copy.sons.items():
new_node.sons[a]=r_copy_tree(son_to_copy,(new_node,a))
return new_node
def copy_tree(root):
return r_copy_tree(root,(None,None))
When trying to improve performance you should almost always start with profiling data and then optimize based on what you see there. Start by using cProfile.run
to run your top-level tree copy code, then use pstats.Stats
class to inspect the profiling data and see where you should really focus your optimization. I recommend starting by sorting your stats by cumulative
time.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With