I have a DataFrame whose columns are a RangeIndex. I want to rename them.
import pandas as pd
>>> my_df
0 1
Alpha -0.1234 0.001
Beta 0.7890 0.005
>>> my_df.columns
RangeIndex(start=0, stop=2, step=1)
I want to do something like:
my_df = my_df.rename({'0': 'Betas', '1': 'P-values'})
And it should look like:
>>> my_df
Betas P-values
Alpha -0.1234 0.001
Beta 0.7890 0.005
But it does not change the column names.
One way to rename the columns of a pandas DataFrame is the rename() method. It is most useful when only some columns need new names, because the mapping has to list only the columns being renamed.
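As a minimal sketch of that selective behaviour, the snippet below rebuilds a frame like the one in the question (the construction itself is an assumption, since the question only shows the printed output) and renames a single column. It also shows why the attempt in the question has no effect: the labels of a RangeIndex are integers, so string keys match nothing, and rename() ignores unmatched keys by default.

```python
import pandas as pd

# Hypothetical reconstruction of the frame in the question.
# No column names are given, so pandas assigns RangeIndex(start=0, stop=2, step=1).
my_df = pd.DataFrame(
    [[-0.1234, 0.001], [0.7890, 0.005]],
    index=["Alpha", "Beta"],
)

# String keys do not match the integer labels 0 and 1, so nothing is renamed
# and no error is raised.
unchanged = my_df.rename(columns={"0": "Betas", "1": "P-values"})

# Integer keys match. Only the columns listed in the mapping are renamed.
renamed = my_df.rename(columns={1: "P-values"})

print(list(unchanged.columns))
print(list(renamed.columns))
```

Passing errors="raise" to rename() should turn an unmatched key into a KeyError, which can make this kind of mismatch easier to spot.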
Pandas, however, can be tricked into allowing duplicate column names. Duplicate column names are a problem if you plan to transfer your data set to another statistical language. They can also cause unexpected and sometimes hard-to-debug problems in Python.
The pandas rename() method can rename index (row) labels as well as column labels.
To find duplicate columns, iterate over all columns of the DataFrame and, for each one, check whether another column with the same contents already exists. If it does, store that column's name in a set of duplicate columns.
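The loop described above could be sketched roughly as follows. The helper name find_duplicate_columns and the sample frame are made up for illustration. Columns are compared by position with Series.equals, so the check still works if two columns happen to share a label.

```python
import pandas as pd

def find_duplicate_columns(df):
    """Return the labels of columns whose contents repeat an earlier column.

    Hypothetical helper: a straightforward pairwise comparison, not a pandas API.
    """
    duplicates = set()
    n_cols = df.shape[1]
    for i in range(n_cols):
        for j in range(i + 1, n_cols):
            # Series.equals compares index and values; the column name is ignored.
            if df.iloc[:, i].equals(df.iloc[:, j]):
                duplicates.add(df.columns[j])
    return duplicates

# Illustrative data: column "c" repeats the contents of column "a".
sample = pd.DataFrame({"a": [1, 2], "b": [3, 4], "c": [1, 2]})
found = find_duplicate_columns(sample)
print(found)
```

The pairwise comparison is quadratic in the number of columns, which is fine for small frames but may be slow for very wide ones.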
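Simple and straightforward. The labels of a RangeIndex are the integers 0 and 1, not the strings '0' and '1', so the mapping keys must be integers. The mapping also has to be passed as columns=, because rename() applies a bare mapping to the row index by default.
my_df.rename(columns={0: 'Betas', 1: 'P-values'}, inplace=True)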
Even nicer, as borrowed from Edchum:
my_df.columns = ['Betas', 'P-values']
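A short sketch of the direct assignment, again using a made-up reconstruction of the frame from the question. Unlike rename(), this replaces every label at once, in positional order, so the list needs exactly one entry per column.

```python
import pandas as pd

# Hypothetical reconstruction of the frame in the question.
my_df = pd.DataFrame(
    [[-0.1234, 0.001], [0.7890, 0.005]],
    index=["Alpha", "Beta"],
)

# Replace all column labels in one step; order follows column position.
my_df.columns = ["Betas", "P-values"]
labels_after_assignment = list(my_df.columns)

# A list of the wrong length is rejected with a ValueError.
try:
    my_df.columns = ["only_one"]
    length_mismatch_rejected = False
except ValueError:
    length_mismatch_rejected = True

print(labels_after_assignment, length_mismatch_rejected)
```

This form is convenient when every column is being renamed. When only a few of many columns change, the rename() mapping is usually less error-prone.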