I am new to Python and am completely baffled by why the following happens.
Here is a modified version of code I found in another question on Stack Overflow (the original question is "Replace single value in a pandas dataframe, when index is not known and values in column are unique"):
import pandas as pd
# Create a dataframe df1
df1 = pd.DataFrame([[5, 2], [3, 4]], columns=('a', 'b'))
#print df1
df1
   a  b
0  5  2
1  3  4
# copy it into df2
df2 = df1
#print df2
df2
   a  b
0  5  2
1  3  4
# modify the value in df2 in column b where column a is 3
df2.loc[df2.a == 3, 'b'] = 6
# print df2 to check that the value has changed
df2
   a  b
0  5  2
1  3  6
# BUT changing df2 changed df1 also! Print df1
df1
   a  b
0  5  2
1  3  6
Can someone please explain this? Thanks
Try the following code:
df2 = df1.copy()
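The assignment df2 = df1 does not copy anything. It only binds a second name, df2, to the same DataFrame object that df1 already refers to. Because both names point to one underlying object, any change made through df2 is also visible through df1. Calling df1.copy() creates a new, independent DataFrame instead.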
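To see the difference directly, here is a minimal sketch that compares plain assignment with copy(). The names alias and independent are made up for illustration, and the values written (6 and 9) are arbitrary:

```python
import pandas as pd

df1 = pd.DataFrame([[5, 2], [3, 4]], columns=["a", "b"])

alias = df1                # a second name for the same object, no data is copied
independent = df1.copy()   # a new DataFrame with its own data (deep copy by default)

# "is" checks object identity, i.e. whether two names refer to one object
print(alias is df1)        # True
print(independent is df1)  # False

# Writing through the copy leaves df1 untouched
independent.loc[independent.a == 3, "b"] = 6

# Writing through the alias changes df1, because they are the same object
alias.loc[alias.a == 5, "b"] = 9

b_in_df1 = df1["b"].tolist()
b_in_copy = independent["b"].tolist()
print(b_in_df1)   # [9, 4]
print(b_in_copy)  # [2, 6]
```

The same rule applies to any mutable Python object (lists, dicts, and so on): assignment binds a name, it never duplicates the object.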