I want to find a way to change the name of a specific column in a DataFrame with multilevel (MultiIndex) columns.
With this data:
import pandas as pd

data = {
    ('A', '1', 'I'): [1, 2, 3, 4, 5],
    ('B', '2', 'II'): [1, 2, 3, 4, 5],
    ('C', '3', 'I'): [1, 2, 3, 4, 5],
    ('D', '4', 'II'): [1, 2, 3, 4, 5],
    ('E', '5', 'III'): [1, 2, 3, 4, 5],
}
dataDF = pd.DataFrame(data)
This code does not work:
dataDF.rename(columns = {('A', '1', 'I'):('Z', '100', 'Z')}, inplace=True)
Result:
   A   B  C   D    E
   1   2  3   4    5
   I  II  I  II  III
0  1   1  1   1    1
1  2   2  2   2    2
2  3   3  3   3    3
3  4   4  4   4    4
4  5   5  5   5    5
And neither does this:
dataDF.columns.values[0] = ('Z', '100', 'Z')
Result:
   A   B  C   D    E
   1   2  3   4    5
   I  II  I  II  III
0  1   1  1   1    1
1  2   2  2   2    2
2  3   3  3   3    3
3  4   4  4   4    4
4  5   5  5   5    5
But the combination of the two statements above works:
dataDF.columns.values[0] = ('Z', '100', 'Z')
dataDF.rename(columns = {('A', '1', 'I'):('Z', '100', 'Z')}, inplace=True)
dataDF
Result:
     Z   B  C   D    E
   100   2  3   4    5
     Z  II  I  II  III
0    1   1  1   1    1
1    2   2  2   2    2
2    3   3  3   3    3
3    4   4  4   4    4
4    5   5  5   5    5
Is this a bug in pandas?
Renaming the MultiIndex columns: to rename the labels of one level of a pandas DataFrame's MultiIndex columns, you can use the set_levels() method, passing the new labels for that level (for example ['b1','c1','d1']).
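As a rough illustration of set_levels(), here is a minimal sketch. The frame and the labels 'a', 'b', 'c', 'd' are made up for this example and are not from the question. Note that set_levels() replaces every distinct label of a level, so it does not target one column in isolation when a label is shared by several columns.

```python
import pandas as pd

# Hypothetical two-level frame, invented for illustration only.
df = pd.DataFrame(
    [[1, 2, 3], [4, 5, 6]],
    columns=pd.MultiIndex.from_tuples([("a", "b"), ("a", "c"), ("a", "d")]),
)

# set_levels returns a new MultiIndex; assign it back to the columns.
# The new labels replace the existing labels of level 1 position by position.
df.columns = df.columns.set_levels(["b1", "c1", "d1"], level=1)

new_columns = df.columns.tolist()
print(new_columns)
```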
One way of renaming the columns in a pandas DataFrame is by using the rename() function. This method is quite useful when we need to rename some selected columns because we need to specify information only for the columns which are to be renamed.
You can use the rename() method of pandas.DataFrame to change column/index names individually. Specify the original name and the new name in a dict like {original name: new name} for the columns / index parameter of rename().
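With MultiIndex columns, the dict passed to rename() appears to be matched against the individual labels of each level rather than against whole tuples, which would explain why a tuple key has no effect. A minimal sketch, using a two-column subset of the question's data and the level argument of rename():

```python
import pandas as pd

data = {
    ("A", "1", "I"): [1, 2, 3, 4, 5],
    ("B", "2", "II"): [1, 2, 3, 4, 5],
}
df = pd.DataFrame(data)

# Rename one level at a time; each dict maps a single label to a new label.
renamed = (
    df.rename(columns={"A": "Z"}, level=0)
      .rename(columns={"1": "100"}, level=1)
)

renamed_columns = renamed.columns.tolist()
print(renamed_columns)
```

This only isolates one column when the label is unique within its level. Renaming 'I' at level 2 in the full data, for instance, would also change column C.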
This is my theory
pandas does not want pd.Index objects to be mutable. We can see this if we try to change the first element of the index ourselves:
dataDF.columns[0] = ('Z', '100', 'Z')
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-32-2c0b76762235> in <module>()
----> 1 dataDF.columns[0] = ('Z', '100', 'Z')
//anaconda/envs/3.5/lib/python3.5/site-packages/pandas/indexes/base.py in __setitem__(self, key, value)
1372
1373 def __setitem__(self, key, value):
-> 1374 raise TypeError("Index does not support mutable operations")
1375
1376 def __getitem__(self, key):
TypeError: Index does not support mutable operations
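The same check can be reproduced in a small self-contained script. This is only a sketch on a two-column subset of the question's data; the variable names are made up for the example.

```python
import pandas as pd

df = pd.DataFrame({("A", "1", "I"): [1, 2], ("B", "2", "II"): [3, 4]})

try:
    # Item assignment on an Index (including a MultiIndex) is rejected.
    df.columns[0] = ("Z", "100", "Z")
    assignment_failed = False
except TypeError as exc:
    assignment_failed = True
    print(exc)

# The first column label is unchanged after the failed assignment.
first_column = df.columns[0]
```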
But pandas can't control what you do to the values attribute.
dataDF.columns.values[0] = ('Z', '100', 'Z')
We see that dataDF.columns looks the same, but dataDF.columns.values clearly reflects the change. Unfortunately, dataDF.columns.values isn't what shows up on the display of the dataframe.
On the other hand, this really does seem like it should work. The fact that it doesn't feels wrong to me.
dataDF.rename(columns={('A', '1', 'I'): ('Z', '100', 'Z')}, inplace=True)
I believe the reason this only works after having changed the values is that rename is forcing the reconstruction of the columns by looking at the values. Since we changed the values, it now works. This is exceptionally kludgy and I don't recommend building a process that relies on this.
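My recommendation
from_col = ('A', '1', 'I')
to_col = ('Z', '100', 'Z')
# Integer position of the column to rename (the column tuples are unique here)
colloc = dataDF.columns.get_loc(from_col)
# Work on a plain list of tuples instead of mutating columns.values in place
cvals = dataDF.columns.tolist()
cvals[colloc] = to_col
# Rebuild the MultiIndex from the edited tuples and assign it back
dataDF.columns = pd.MultiIndex.from_tuples(cvals)
dataDF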
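The same idea can be wrapped in a small helper that takes a mapping from old column tuples to new ones. This is a sketch, not a pandas API: rename_column_tuples is a hypothetical name introduced here, and it returns a copy rather than modifying the frame in place.

```python
import pandas as pd

data = {
    ("A", "1", "I"): [1, 2, 3, 4, 5],
    ("B", "2", "II"): [1, 2, 3, 4, 5],
    ("C", "3", "I"): [1, 2, 3, 4, 5],
    ("D", "4", "II"): [1, 2, 3, 4, 5],
    ("E", "5", "III"): [1, 2, 3, 4, 5],
}
dataDF = pd.DataFrame(data)


def rename_column_tuples(frame, mapping):
    """Hypothetical helper: return a copy of `frame` whose column tuples
    are replaced according to `mapping` (old tuple -> new tuple)."""
    # Iterating over a MultiIndex yields one tuple per column.
    new_cols = [mapping.get(col, col) for col in frame.columns]
    out = frame.copy()
    out.columns = pd.MultiIndex.from_tuples(new_cols, names=frame.columns.names)
    return out


renamed = rename_column_tuples(dataDF, {("A", "1", "I"): ("Z", "100", "Z")})
print(renamed.columns.tolist())
```

Only the columns listed in the mapping change, and the original dataDF keeps its labels, so the result does not depend on the state of columns.values.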