 

Query about pandas copy() method

Tags:

python

pandas

import pandas as pd

df1 = pd.DataFrame({'A': ['aaa', 'bbb', 'ccc'], 'B': [1, 2, 3]})
df2 = df1.copy()
df1.loc[0, 'A'] = '111'  # modifying the 1st element of column A
print(df1)
print(df2)

When I modify df1, the object df2 is not modified. I expected that because I used copy().

s1 = pd.Series([[1, 2], [3, 4]])
s2 = s1.copy()
s1[0][0] = 0  # modifying the 1st element of list [1,2]
print(s1)
print(s2)

But why did s2 change as well in this case? I expected s2 to stay unchanged because I used copy() to create it, but to my surprise, modifying s1 also modifies s2. I don't understand why.

DLopezG asked May 18 '26 09:05


2 Answers

This is occurring because your pd.Series is of dtype=object, so it essentially copied a bunch of references to python objects. Observe:

In [1]: import pandas as pd

In [2]: s1=pd.Series([[1,2],[3,4]])
   ...:

In [3]: s1
Out[3]:
0    [1, 2]
1    [3, 4]
dtype: object

In [4]: s1.dtype
Out[4]: dtype('O')

Since list objects are mutable, the operation:

s1[0][0]=0

modifies the list in place, and both s1 and s2 hold a reference to that same list.

This behavior is a "shallow copy" of the elements, which normally isn't an issue with pandas data structures. Usually you would be using a numeric dtype, where the values themselves are copied and there are no references to share, or, if you do use the object dtype, you would be storing Python str objects, which are immutable.
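As a minimal sketch of the point above (the variable names are made up for illustration), copies of a numeric Series and of a Series of strings stay independent of the original, because assignment replaces a value rather than mutating a shared object:

```python
import pandas as pd

# Numeric dtype: the copy owns its own values
nums = pd.Series([1, 2, 3])
nums_copy = nums.copy()
nums.iloc[0] = 100
print(nums_copy.iloc[0])  # 1

# Strings are immutable: assigning a new string to the original
# rebinds that slot and leaves the copy untouched
words = pd.Series(['aaa', 'bbb', 'ccc'])
words_copy = words.copy()
words.iloc[0] = '111'
print(words_copy.iloc[0])  # aaa
```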

Note, pandas containers have a different notion of a deep-copy. Notice the .copy method has a default deep=True, but from the documentation:

When deep=True (default), a new object will be created with a copy of the calling object's data and indices. Modifications to the data or indices of the copy will not be reflected in the original object (see notes below).

When deep=False, a new object will be created without copying the calling object's data or index (only references to the data and index are copied). Any changes to the data of the original will be reflected in the shallow copy (and vice versa). ... When deep=True, data is copied but actual Python objects will not be copied recursively, only the reference to the object. This is in contrast to copy.deepcopy in the Standard Library, which recursively copies object data (see examples below).
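The quoted distinction can be checked with a short sketch. It assumes NumPy is available alongside pandas and uses np.shares_memory to compare the underlying buffers; the variable names are illustrative only:

```python
import numpy as np
import pandas as pd

nums = pd.Series([1, 2, 3])
shallow = nums.copy(deep=False)
deep = nums.copy()  # deep=True is the default

# The shallow copy points at the same buffer; the deep copy has its own
print(np.shares_memory(nums.to_numpy(), shallow.to_numpy()))
print(np.shares_memory(nums.to_numpy(), deep.to_numpy()))

# With dtype=object, a deep copy duplicates the array of references,
# but each reference still points at the same Python list
lists = pd.Series([[1, 2], [3, 4]])
lists_deep = lists.copy()
print(np.shares_memory(lists.to_numpy(), lists_deep.to_numpy()))
print(lists.iloc[0] is lists_deep.iloc[0])
```

The last two checks together are the case from the question: the container is new, but the list objects inside it are shared.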

Again, this is because pandas is designed around numeric dtypes, with some built-in support for str objects. A pd.Series of list objects is unusual, and really not a good use case for a pd.Series.

juanpa.arrivillaga answered May 19 '26 22:05


When you copied the s1 object, pandas created a new, separate Series object and bound it to s2, just as you expected. However, the two lists within the s1 Series object were not duplicated along with the Series. Only their references were copied.

See here for a good starting point towards understanding the difference between a Python reference and an object.

Simply put, a Python variable is not the same thing as the actual Python object. Variables (like s1 and s2) are simply references that point to the memory location where the actual object lives.

Because the original Series object s1 contained two list references rather than the list objects themselves, only the references to the internal list objects were copied (not the list objects).

import pandas as pd

s1 = pd.Series([[1, 2], [3, 4]])
# The object referenced by variable "s1" has a memory address
print("s1:", hex(id(s1)))
s2 = s1.copy()
# The object referenced by variable "s2" has a different memory address
print("s2:", hex(id(s2)))
# However, when you copied "s1", the
# list items within only had their references copied
# So "s1[0]" and "s2[0]" are simply references to the same object
print("s1[0]:", hex(id(s1[0])))
print("s2[0]:", hex(id(s2[0])))

OUTPUT:

s1: 0x7fcdf5678898 # A different address from s2
s2: 0x7fcddee25240 # A different address from s1
s1[0]: 0x7fcdddf9f6c8 # The same address for the first list
s2[0]: 0x7fcdddf9f6c8 # The same address for the first list

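As @juanpa.arrivillaga's answer correctly points out, avoiding this requires a true deep copy that also duplicates the list objects themselves. The .copy() method does not do that, even with deep=True.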
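One possible way to get fully independent lists, shown here only as a sketch and not as an official pandas recipe, is to deep-copy each element explicitly with copy.deepcopy from the standard library and build a new Series from the results:

```python
import copy

import pandas as pd

s1 = pd.Series([[1, 2], [3, 4]])

# Deep-copy every element so that s2 holds its own list objects
s2 = pd.Series([copy.deepcopy(item) for item in s1], index=s1.index)

s1.iloc[0][0] = 0  # mutate the first list of s1 in place

print(s1.iloc[0])  # [0, 2]
print(s2.iloc[0])  # [1, 2]
```

This copies every element one by one, so it can be slow on a large Series. Storing the values in separate numeric columns, where possible, avoids the problem altogether.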

RightmireM answered May 20 '26 00:05


