Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

In pandas, can I deeply copy a DataFrame including its index and column?

Tags:

python

pandas

First, I create a DataFrame

In [61]: import pandas as pd
In [62]: df = pd.DataFrame([[1], [2], [3]])

Then, I deeply copy it by copy

In [63]: df2 = df.copy(deep=True)

Now the DataFrame are different.

In [64]: id(df), id(df2)
Out[64]: (4385185040, 4385183312)

However, the index are still the same.

In [65]: id(df.index), id(df2.index)
Out[65]: (4385175264, 4385175264)

Same thing happen in columns, is there any way that I can easily deeply copy it not only values but also index and columns?

like image 450
waitingkuo Avatar asked Jul 11 '13 10:07

waitingkuo


People also ask

How do I deep copy a DataFrame in pandas?

To create deep copy of Pandas DataFrame, use df. copy() or df. copy(deep=True) method.

Is pandas copy a deep copy?

Pandas DataFrame copy() MethodBy default, the copy is a "deep copy" meaning that any changes made in the original DataFrame will NOT be reflected in the copy.

How do I copy an index in pandas?

copy() function make a copy of this object. The function also sets the name and dtype attribute of the new object as that of original object. If we wish to have a different datatype for the new object then we can do that by setting the dtype attribute of the function.


2 Answers

Latest version of Pandas does not have this issue anymore

  import pandas as pd
  df = pd.DataFrame([[1], [2], [3]])

  df2 = df.copy(deep=True)

  id(df), id(df2)
  Out[3]: (136575472, 127792400)

  id(df.index), id(df2.index)
  Out[4]: (145820144, 127657008)
like image 173
Sergey Avatar answered Oct 08 '22 05:10

Sergey


I wonder whether this is a bug in pandas... it's interesting because Index/MultiIndex (index and columns) are in some sense supposed to be immutable (however I think these should be copies).

For now, it's easy to create your own method, and add it to DataFrame:

In [11]: def very_deep_copy(self):
    return pd.DataFrame(self.values.copy(), self.index.copy(), self.columns.copy())

In [12]: pd.DataFrame.very_deep_copy = very_deep_copy

In [13]: df2 = df.very_deep_copy()

As you can see this will create new objects (and preserve names):

In [14]: id(df.columns)
Out[14]: 4370636624

In [15]: id(df2.columns)
Out[15]: 4372118776
like image 11
Andy Hayden Avatar answered Oct 08 '22 06:10

Andy Hayden