I have a tree structure of widgets, e.g. a collection contains models and a model contains widgets. I want to copy the whole collection. copy.deepcopy is faster than pickling and unpickling the object, but cPickle, being written in C, is much faster still, so why shouldn't I just use cPickle instead of deepcopy?
Sample test code:
import copy
import pickle
import cPickle

class A(object): pass

d = {}
for i in range(1000):
    d[i] = A()

def copy1():
    return copy.deepcopy(d)

def copy2():
    return pickle.loads(pickle.dumps(d, -1))

def copy3():
    return cPickle.loads(cPickle.dumps(d, -1))
Timings:
>python -m timeit -s "import c" "c.copy1()"
10 loops, best of 3: 46.3 msec per loop
>python -m timeit -s "import c" "c.copy2()"
10 loops, best of 3: 93.3 msec per loop
>python -m timeit -s "import c" "c.copy3()"
100 loops, best of 3: 17.1 msec per loop
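As a side note: in Python 3 the cPickle module no longer exists; pickle automatically uses its C-accelerated implementation when available. A rough Python 3 version of the benchmark might look like this (timings will vary by machine and version):

```python
# Python 3 version of the benchmark: cPickle is gone; pickle uses its
# C-accelerated implementation automatically when available.
import copy
import pickle
import timeit

class A:
    pass

d = {i: A() for i in range(1000)}

def copy1():
    return copy.deepcopy(d)

def copy2():
    return pickle.loads(pickle.dumps(d, -1))

for fn in (copy1, copy2):
    t = timeit.timeit(fn, number=10)
    print(f"{fn.__name__}: {t / 10 * 1000:.1f} msec per call")
```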
copy() creates a shallow copy: the new container holds references to the same objects as the original, so if you mutate a shared object through the copy, the change is visible through the original too. deepcopy() creates a new object and does a real, recursive copy of the original into it; changing the deepcopied object doesn't affect the original object.
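A minimal illustration of that difference (the dict and list here are just example data):

```python
import copy

original = {"widgets": [1, 2, 3]}

shallow = copy.copy(original)   # new dict, but the inner list is shared
deep = copy.deepcopy(original)  # new dict AND a new inner list

shallow["widgets"].append(4)    # mutates the shared list
print(original["widgets"])      # [1, 2, 3, 4] -- visible through the original

deep["widgets"].append(99)
print(original["widgets"])      # still [1, 2, 3, 4] -- the deep copy is independent
```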
Deep copy is a process in which the copying process occurs recursively. It means first constructing a new collection object and then recursively populating it with copies of the child objects found in the original.
deepcopy() can be extremely slow on large object graphs.
Changes made to the new deepcopied object are not reflected in the original object. A shallow copy creates a new container but keeps references to the same child objects; a deep copy recursively copies the child objects as well. Shallow copy is therefore faster.
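That speed claim is easy to check, since a shallow copy only duplicates the top-level container while a deep copy walks the whole graph. A quick sketch (exact numbers depend on your machine):

```python
import copy
import timeit

# 1000 entries, each holding a 100-element list: deepcopy must copy
# every inner list, while copy.copy only rebuilds the outer dict.
data = {i: list(range(100)) for i in range(1000)}

shallow_t = timeit.timeit(lambda: copy.copy(data), number=100)
deep_t = timeit.timeit(lambda: copy.deepcopy(data), number=100)

print(f"shallow: {shallow_t:.4f}s  deep: {deep_t:.4f}s")
```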
The problem is that pickle+unpickle can be faster (in the C implementation) because it is less general than deepcopy: many objects can be deepcopied but not pickled. Suppose, for example, that your class A were changed to:
class A(object):
    class B(object): pass
    def __init__(self):
        self.b = self.B()
Now copy1 still works fine (A's complexity slows it down but absolutely doesn't stop it), while copy2 and copy3 break; the end of the stack trace says:
File "./c.py", line 20, in copy3
    return cPickle.loads(cPickle.dumps(d, -1))
PicklingError: Can't pickle <class 'c.B'>: attribute lookup c.B failed
I.e., pickling always assumes that classes and functions are top-level entities in their modules, and so pickles them "by name" -- deepcopying makes absolutely no such assumptions.
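The same limitation still bites in Python 3 whenever a class is not reachable by name at module level, for example a class defined inside a function (a hedged sketch; `make_widget` and `B` are invented for the illustration):

```python
import copy
import pickle

def make_widget():
    # B is local to this function, so pickle cannot look it up by name
    class B:
        def __init__(self):
            self.value = 42
    return B()

w = make_widget()

clone = copy.deepcopy(w)   # fine: deepcopy makes no name-lookup assumptions
print(clone.value)         # 42

try:
    pickle.dumps(w)
except (pickle.PicklingError, AttributeError) as exc:
    print("pickle failed:", exc)
```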
So if you have a situation where the speed of this "somewhat deep" copying is absolutely crucial, every millisecond matters, AND you want to take advantage of special limitations that you KNOW apply to the objects you're duplicating (such as those that make pickling applicable, or ones favoring yet other forms of serialization and shortcuts), by all means go ahead. But if you do, you MUST be aware that you're constraining your system to live by those limitations forevermore, and you should document that design decision very clearly and explicitly for the benefit of future maintainers.
For the NORMAL case, where you want generality, use deepcopy
!-)
You should be using deepcopy because it makes your code more readable. Using a serialization mechanism to copy objects in memory is at the very least confusing to another developer reading your code. Using deepcopy also means you get to reap the benefits of future optimizations in deepcopy.
First rule of optimization: don't.