First, I create a DataFrame
In [61]: import pandas as pd
In [62]: df = pd.DataFrame([[1], [2], [3]])
Then, I deep copy it with copy():
In [63]: df2 = df.copy(deep=True)
Now the two DataFrame objects are different:
In [64]: id(df), id(df2)
Out[64]: (4385185040, 4385183312)
However, their index objects are still the same:
In [65]: id(df.index), id(df2.index)
Out[65]: (4385175264, 4385175264)
The same thing happens with the columns. Is there any way to easily deep copy not only the values but also the index and columns?
To create a deep copy of a pandas DataFrame, use the df.copy() or df.copy(deep=True) method.
By default the copy is a "deep copy", meaning that any changes made to the original DataFrame will NOT be reflected in the copy.
copy() makes a copy of the object's data and indices, and the new object keeps the names and dtypes of the original.
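As a minimal sketch (the column name and values here are made up for illustration), you can verify that edits to a deep copy do not touch the original:

import pandas as pd

# Create a small DataFrame and a deep copy of it.
df = pd.DataFrame({"a": [1, 2, 3]})
df2 = df.copy(deep=True)

# Modifying the copy's data does not affect the original.
df2.loc[0, "a"] = 99
print(df.loc[0, "a"])   # still 1
print(df2.loc[0, "a"])  # 99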
The latest version of pandas does not have this issue anymore:
import pandas as pd
df = pd.DataFrame([[1], [2], [3]])
df2 = df.copy(deep=True)
id(df), id(df2)
Out[3]: (136575472, 127792400)
id(df.index), id(df2.index)
Out[4]: (145820144, 127657008)
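A quick follow-up check (not part of the original answer, just a hedged sketch) confirming that the copy's index is truly independent:

import pandas as pd

df = pd.DataFrame([[1], [2], [3]])
df2 = df.copy(deep=True)

# The copies hold distinct Index objects...
print(df.index is df2.index)   # False

# ...so replacing the copy's index does not affect the original.
df2.index = [10, 20, 30]
print(list(df.index))   # [0, 1, 2]
print(list(df2.index))  # [10, 20, 30]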
I wonder whether this is a bug in pandas... it's interesting because Index/MultiIndex (index and columns) are in some sense supposed to be immutable (however I think these should be copies).
For now, it's easy to create your own method, and add it to DataFrame:
In [11]: def very_deep_copy(self):
   ....:     return pd.DataFrame(self.values.copy(), self.index.copy(), self.columns.copy())
In [12]: pd.DataFrame.very_deep_copy = very_deep_copy
In [13]: df2 = df.very_deep_copy()
As you can see this will create new objects (and preserve names):
In [14]: id(df.columns)
Out[14]: 4370636624
In [15]: id(df2.columns)
Out[15]: 4372118776
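For convenience, here is a self-contained version of the workaround above, run on a DataFrame with named index and columns (the names "rows" and "cols" are made up for illustration) to show that the names survive and no objects are shared:

import pandas as pd

def very_deep_copy(self):
    # Copy values, index, and columns explicitly so no objects are shared.
    return pd.DataFrame(self.values.copy(), self.index.copy(), self.columns.copy())

pd.DataFrame.very_deep_copy = very_deep_copy

df = pd.DataFrame([[1], [2], [3]], columns=pd.Index(["x"], name="cols"))
df.index.name = "rows"

df2 = df.very_deep_copy()
print(df2.index.name, df2.columns.name)  # rows cols  (names preserved)
print(df.index is df2.index)             # False
print(df.columns is df2.columns)         # False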