Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What is the difference between shallow copy, deepcopy and normal assignment operation?

People also ask

What does a shallow copy mean?

A shallow copy of an object is a copy whose properties share the same references (point to the same underlying values) as those of the source object from which the copy was made.

What is the difference between a shallow copy and deep copy?

Shallow Copy stores the references of objects to the original memory address. Deep copy stores copies of the object's value. Shallow Copy reflects changes made to the new/copied object in the original object. Deep copy doesn't reflect changes made to the new/copied object in the original object.

What is the difference between shallow copy and Deepcopy in JavaScript?

A deep copy means that all of the values of the new variable are copied and disconnected from the original variable. A shallow copy means that certain (sub-)values are still connected to the original variable. To really understand copying, you have to get into how JavaScript stores values.

What is the main difference between reference copy and shallow copy?

All the data member and reference are copied and assigned to new object. So, unlike reference copy, newly created object point to its own “Actual object” . However , in shallow copy reference are also copied i.e, the all the reference object inside object still point to the same object as the original object.


Normal assignment operations will simply point the new variable towards the existing object. The docs explain the difference between shallow and deep copies:

The difference between shallow and deep copying is only relevant for compound objects (objects that contain other objects, like lists or class instances):

  • A shallow copy constructs a new compound object and then (to the extent possible) inserts references into it to the objects found in the original.

  • A deep copy constructs a new compound object and then, recursively, inserts copies into it of the objects found in the original.

Here's a little demonstration:

import copy

a = [1, 2, 3]
b = [4, 5, 6]
c = [a, b]

Using normal assignment operatings to copy:

d = c

print id(c) == id(d)          # True - d is the same object as c
print id(c[0]) == id(d[0])    # True - d[0] is the same object as c[0]

Using a shallow copy:

d = copy.copy(c)

print id(c) == id(d)          # False - d is now a new object
print id(c[0]) == id(d[0])    # True - d[0] is the same object as c[0]

Using a deep copy:

d = copy.deepcopy(c)

print id(c) == id(d)          # False - d is now a new object
print id(c[0]) == id(d[0])    # False - d[0] is now a new object

For immutable objects, there is no need for copying because the data will never change, so Python uses the same data; ids are always the same. For mutable objects, since they can potentially change, [shallow] copy creates a new object.

Deep copy is related to nested structures. If you have list of lists, then deepcopy copies the nested lists also, so it is a recursive copy. With just copy, you have a new outer list, but inner lists are references.

Assignment does not copy. It simply sets the reference to the old data. So you need copy to create a new list with the same contents.


For immutable objects, creating a copy don't make much sense since they are not going to change. For mutable objects assignment,copy and deepcopy behaves differently. Lets talk about each of them with examples.

An assignment operation simply assigns the reference of source to destination e.g:

>>> i = [1,2,3]
>>> j=i
>>> hex(id(i)), hex(id(j))
>>> ('0x10296f908', '0x10296f908') #Both addresses are identical

Now i and j technically refers to same list. Both i and j have same memory address. Any updation to either of them will be reflected to the other. e.g:

>>> i.append(4)
>>> j
>>> [1,2,3,4] #Destination is updated

>>> j.append(5)
>>> i
>>> [1,2,3,4,5] #Source is updated

On the other hand copy and deepcopy creates a new copy of variable. So now changes to original variable will not be reflected to the copy variable and vice versa. However copy(shallow copy), don't creates a copy of nested objects, instead it just copies the reference of nested objects. Deepcopy copies all the nested objects recursively.

Some examples to demonstrate behaviour of copy and deepcopy:

Flat list example using copy:

>>> import copy
>>> i = [1,2,3]
>>> j = copy.copy(i)
>>> hex(id(i)), hex(id(j))
>>> ('0x102b9b7c8', '0x102971cc8') #Both addresses are different

>>> i.append(4)
>>> j
>>> [1,2,3] #Updation of original list didn't affected copied variable

Nested list example using copy:

>>> import copy
>>> i = [1,2,3,[4,5]]
>>> j = copy.copy(i)

>>> hex(id(i)), hex(id(j))
>>> ('0x102b9b7c8', '0x102971cc8') #Both addresses are still different

>>> hex(id(i[3])), hex(id(j[3]))
>>> ('0x10296f908', '0x10296f908') #Nested lists have same address

>>> i[3].append(6)
>>> j
>>> [1,2,3,[4,5,6]] #Updation of original nested list updated the copy as well

Flat list example using deepcopy:

>>> import copy
>>> i = [1,2,3]
>>> j = copy.deepcopy(i)
>>> hex(id(i)), hex(id(j))
>>> ('0x102b9b7c8', '0x102971cc8') #Both addresses are different

>>> i.append(4)
>>> j
>>> [1,2,3] #Updation of original list didn't affected copied variable

Nested list example using deepcopy:

>>> import copy
>>> i = [1,2,3,[4,5]]
>>> j = copy.deepcopy(i)

>>> hex(id(i)), hex(id(j))
>>> ('0x102b9b7c8', '0x102971cc8') #Both addresses are still different

>>> hex(id(i[3])), hex(id(j[3]))
>>> ('0x10296f908', '0x102b9b7c8') #Nested lists have different addresses

>>> i[3].append(6)
>>> j
>>> [1,2,3,[4,5]] #Updation of original nested list didn't affected the copied variable    

Let's see in a graphical example how the following code is executed:

import copy

class Foo(object):
    def __init__(self):
        pass


a = [Foo(), Foo()]
shallow = copy.copy(a)
deep = copy.deepcopy(a)

enter image description here


a, b, c, d, a1, b1, c1 and d1 are references to objects in memory, which are uniquely identified by their ids.

An assignment operation takes a reference to the object in memory and assigns that reference to a new name. c=[1,2,3,4] is an assignment that creates a new list object containing those four integers, and assigns the reference to that object to c. c1=c is an assignment that takes the same reference to the same object and assigns that to c1. Since the list is mutable, anything that happens to that list will be visible regardless of whether you access it through c or c1, because they both reference the same object.

c1=copy.copy(c) is a "shallow copy" that creates a new list and assigns the reference to the new list to c1. c still points to the original list. So, if you modify the list at c1, the list that c refers to will not change.

The concept of copying is irrelevant to immutable objects like integers and strings. Since you can't modify those objects, there is never a need to have two copies of the same value in memory at different locations. So integers and strings, and some other objects to which the concept of copying does not apply, are simply reassigned. This is why your examples with a and b result in identical ids.

c1=copy.deepcopy(c) is a "deep copy", but it functions the same as a shallow copy in this example. Deep copies differ from shallow copies in that shallow copies will make a new copy of the object itself, but any references inside that object will not themselves be copied. In your example, your list has only integers inside it (which are immutable), and as previously discussed there is no need to copy those. So the "deep" part of the deep copy does not apply. However, consider this more complex list:

e = [[1, 2],[4, 5, 6],[7, 8, 9]]

This is a list that contains other lists (you could also describe it as a two-dimensional array).

If you run a "shallow copy" on e, copying it to e1, you will find that the id of the list changes, but each copy of the list contains references to the same three lists -- the lists with integers inside. That means that if you were to do e[0].append(3), then e would be [[1, 2, 3],[4, 5, 6],[7, 8, 9]]. But e1 would also be [[1, 2, 3],[4, 5, 6],[7, 8, 9]]. On the other hand, if you subsequently did e.append([10, 11, 12]), e would be [[1, 2, 3],[4, 5, 6],[7, 8, 9],[10, 11, 12]]. But e1 would still be [[1, 2, 3],[4, 5, 6],[7, 8, 9]]. That's because the outer lists are separate objects that initially each contain three references to three inner lists. If you modify the inner lists, you can see those changes no matter if you are viewing them through one copy or the other. But if you modify one of the outer lists as above, then e contains three references to the original three lists plus one more reference to a new list. And e1 still only contains the original three references.

A 'deep copy' would not only duplicate the outer list, but it would also go inside the lists and duplicate the inner lists, so that the two resulting objects do not contain any of the same references (as far as mutable objects are concerned). If the inner lists had further lists (or other objects such as dictionaries) inside of them, they too would be duplicated. That's the 'deep' part of the 'deep copy'.