
If I want non-recursive deep copy of my object, should I override copy or deepcopy in Python?

An object of my class has a list as its attribute. That is,

class T(object):
    def __init__(self, x, y):
        self.arr = [x, y]

When this object is copied, I want a separate list arr, but only a shallow copy of the list's contents (i.e. x and y themselves are not copied). I therefore decided to implement my own copy method, which recreates the list but not the items in it. But should I call it __copy__() or __deepcopy__()? Which is the right name for what I am doing, according to Python semantics?

My guess is __copy__(). If I call deepcopy(), I would expect the clone to be totally decoupled from the original. However, the documentation says:

A deep copy constructs a new compound object and then, recursively, inserts copies into it of the objects found in the original.

I find this confusing: it says "inserts copies into it" rather than "inserts deep copies into it", especially given the emphasis.

user31039 asked Feb 21 '17 23:02



2 Answers

The correct magic method for you to implement here is __copy__.

The behaviour you've described is deeper than what the default behaviour of copy would do (i.e. for an object which hasn't bothered to implement __copy__), but it is not deep enough to be called a deepcopy. Therefore you should implement __copy__ to get the desired behaviour.

Do not make the mistake of thinking that simply assigning another name makes a "copy":

t1 = T('google.com', 123)
t2 = t1  # this does *not* use __copy__

That just binds another name to the same instance. Rather, the __copy__ method is hooked into by a function:

import copy
t2 = copy.copy(t1)
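
To see why a custom __copy__ is needed at all, here is a small sketch comparing the default behaviour of copy.copy with a hand-written __copy__. The class names Plain and Custom are made up for this illustration and are not part of the question:

```python
import copy


class Plain(object):
    """Same layout as T in the question, with no __copy__ defined."""

    def __init__(self, x, y):
        self.arr = [x, y]


class Custom(Plain):
    """Adds a __copy__ that rebuilds the list but reuses its items."""

    def __copy__(self):
        new = type(self).__new__(type(self))  # create instance, skip __init__
        new.__dict__.update(self.__dict__)    # share every attribute ...
        new.arr = list(self.arr)              # ... except arr, which gets a new list
        return new


p1 = Plain([1], [2])
p2 = copy.copy(p1)
print(p1.arr is p2.arr)        # True: the default copy shares the list

c1 = Custom([1], [2])
c2 = copy.copy(c1)
print(c1.arr is c2.arr)        # False: the list was recreated
print(c1.arr[0] is c2.arr[0])  # True: the items are still shared
```

Without __copy__, both instances point at the same arr, so appending to one list shows up in the other. That is the behaviour the question wants to avoid.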
wim answered Sep 18 '22 23:09


Which one to override (__copy__ or __deepcopy__) depends on the behaviour you want from your class.

In general, copy.deepcopy works mostly correctly out of the box: it just copies everything (recursively), so you only need to override __deepcopy__ if there is some attribute that must not be copied (ever!).
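
As a rough sketch of that situation, the hypothetical Cache class below (not from the question) overrides __deepcopy__ so that one attribute, lock, stays shared while everything else is deep-copied. The attribute names are assumptions chosen for illustration:

```python
import copy
import threading


class Cache(object):
    """Hypothetical class: `lock` must be shared, never copied."""

    def __init__(self, data):
        self.data = data
        self.lock = threading.Lock()

    def __deepcopy__(self, memo):
        cls = type(self)
        new = cls.__new__(cls)
        memo[id(self)] = new  # register early so reference cycles resolve to `new`
        for name, value in self.__dict__.items():
            if name == "lock":
                setattr(new, name, value)  # share this attribute as-is
            else:
                setattr(new, name, copy.deepcopy(value, memo))
        return new


a = Cache({"k": [1, 2]})
b = copy.deepcopy(a)
print(a.lock is b.lock)            # True
print(a.data is b.data)            # False
print(a.data["k"] is b.data["k"])  # False
```

Passing memo through to the nested copy.deepcopy calls keeps the usual handling of shared and self-referencing objects intact.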

On the other hand, one should define __copy__ only if users (including yourself) wouldn't expect changes to propagate to copied instances, for example if you simply wrap a mutable type (like list) or use mutable types as an implementation detail.

Then there is also the case where the minimal set of attributes to copy isn't clearly defined. In that case I would also override __copy__, but perhaps raise a TypeError in it and possibly provide one (or several) dedicated public copy methods.
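
One way this could look, as a sketch only: the Table class and its copy_rows / copy_all methods are invented names for illustration, standing in for any class where "shallow copy" has more than one sensible meaning.

```python
import copy


class Table(object):
    """Hypothetical class with no single obvious 'shallow copy'."""

    def __init__(self, rows, index):
        self.rows = rows
        self.index = index

    def __copy__(self):
        # Refuse copy.copy and point callers at the explicit methods.
        raise TypeError("use copy_rows() or copy_all() instead of copy.copy")

    def copy_rows(self):
        """New row list, shared index."""
        return type(self)(list(self.rows), self.index)

    def copy_all(self):
        """New row list and new index dict (items still shared)."""
        return type(self)(list(self.rows), dict(self.index))


t = Table([[1], [2]], {"a": 0})
try:
    copy.copy(t)
except TypeError as exc:
    print("refused:", exc)

u = t.copy_rows()
print(u.rows is t.rows, u.index is t.index)  # False True
```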

However, in my opinion arr counts as an implementation detail here, and thus I would override __copy__:

class T(object):
    def __init__(self, x, y):
        self.arr = [x, y]

    def __copy__(self):
        # Calling the constructor builds a fresh list; x and y are reused.
        new = self.__class__(*self.arr)
        # ... maybe other stuff
        return new

Just to show it works as expected:

from copy import copy, deepcopy

x = T([2], [3])
y = copy(x)
x.arr is y.arr        # False
x.arr[0] is y.arr[0]  # True
x.arr[1] is y.arr[1]  # True

x = T([2], [3])
y = deepcopy(x)
x.arr is y.arr        # False
x.arr[0] is y.arr[0]  # False
x.arr[1] is y.arr[1]  # False

Just a quick note about expectations:

Users generally expect that you can pass an instance to the constructor to create a minimal copy (similar or identical to __copy__) as well. For example:

lst1 = [1,2,3,4]
lst2 = list(lst1)
lst1 is lst2        # False

Some Python types have an explicit copy method that (if present) should do the same as __copy__. It would also allow parameters to be passed in explicitly (however, I haven't seen this in action yet):

lst3 = lst1.copy()  # list.copy() was added in Python 3.3
lst3 is lst1        # False
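
For the T class, an explicit copy method could simply delegate to __copy__ so that both spellings behave identically. This is a minimal sketch; the copy method is an addition for illustration and not something the question asked for:

```python
import copy


class T(object):
    def __init__(self, x, y):
        self.arr = [x, y]

    def __copy__(self):
        return type(self)(*self.arr)

    def copy(self):
        """Explicit method mirroring list.copy(); delegates to __copy__."""
        # Inside the method body, `copy` still refers to the imported module.
        return copy.copy(self)


t1 = T([2], [3])
t2 = t1.copy()
print(t1.arr is t2.arr)        # False
print(t1.arr[0] is t2.arr[0])  # True
```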

If your class is meant to be used by others, you probably need to consider these points; however, if you only want your class to work with copy.copy, then just override __copy__.

MSeifert answered Sep 18 '22 23:09