I looked around but could not find a solution for this. In R's dplyr
we can select and rename columns in one line of code:
select(Com=Commander,Sco=Score)
I'm trying to do the same thing in pandas but have not found a workable solution yet.
Let's say we have this sample data:
import pandas as pd

# Create an example dataframe
data = {'Commander': ['Jason', 'Molly', 'Tina', 'Jake', 'Amy'],
        'Date': ['2012, 02, 08', '2012, 02, 08', '2012, 02, 08', '2012, 02, 08', '2012, 02, 08'],
        'Score': [4, 24, 31, 2, 3]}
df = pd.DataFrame(data, index=['Cochice', 'Pima', 'Santa Cruz', 'Maricopa', 'Yuma'])
df
            Commander          Date  Score
Cochice         Jason  2012, 02, 08      4
Pima            Molly  2012, 02, 08     24
Santa Cruz       Tina  2012, 02, 08     31
Maricopa         Jake  2012, 02, 08      2
Yuma              Amy  2012, 02, 08      3
and I want to select and rename the Commander and Score columns. I tried something like this, which fails:
df[['Com'=='Commander','Sco'=='Score']]
ValueError: Item wrong length 2 instead of 5.
How can I do that?
A bit late, and maybe you've already figured this out, but I had the same problem and the answers here got me most of the way to the solution I used.
The shortest answer to "how to add a rename to select" is to index the dataframe returned by the rename operation with the list of new column names:
df.rename(columns = {"Commander": "Com", "Score": "Sco"})[['Com', 'Sco']]
              Com  Sco
Cochice     Jason    4
Pima        Molly   24
Santa Cruz   Tina   31
Maricopa     Jake    2
Yuma          Amy    3
But it's a little tedious to rewrite the column names, right? So you can initialize the rename with a dictionary:
selector_d = {'Commander': 'Com', 'Score': 'Sco'}
and pass that to the rename and select operations:
df.rename(columns=selector_d)[[*selector_d.values()]]
              Com  Sco
Cochice     Jason    4
Pima        Molly   24
Santa Cruz   Tina   31
Maricopa     Jake    2
Yuma          Amy    3
My scenario was close to this - I had columns that I did not want to rename, but I did want to select them. This can be done by including the columns in the rename/select dictionary, but using the same name.
Here's the whole process with another column added:
data = {
    'Commander': ['Jason', 'Molly', 'Tina', 'Jake', 'Amy'],
    'Date': ['2012, 02, 08', '2012, 02, 08', '2012, 02, 08',
             '2012, 02, 08', '2012, 02, 08'],
    'Score': [4, 24, 31, 2, 3],
    'Team': ['Green', 'Yellow', 'Green', 'Yellow', 'Yellow'],
}
df = pd.DataFrame(data, index=['Cochice', 'Pima', 'Santa Cruz', 'Maricopa', 'Yuma'])
df
            Commander          Date  Score    Team
Cochice         Jason  2012, 02, 08      4   Green
Pima            Molly  2012, 02, 08     24  Yellow
Santa Cruz       Tina  2012, 02, 08     31   Green
Maricopa         Jake  2012, 02, 08      2  Yellow
Yuma              Amy  2012, 02, 08      3  Yellow
selector_d = {'Team': 'Team', 'Commander': 'Com', 'Score': 'Sco'}
df.rename(columns=selector_d)[[*selector_d.values()]]
              Team    Com  Sco
Cochice      Green  Jason    4
Pima        Yellow  Molly   24
Santa Cruz   Green   Tina   31
Maricopa    Yellow   Jake    2
Yuma        Yellow    Amy    3
As you can see, this also allows reordering of the columns in the final dataframe.
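To make the reordering point concrete, here is a minimal sketch on a cut-down version of the sample data. The name reordered_d is made up for this illustration, and I use list(...) instead of the [*...] unpacking, which builds the same list. It relies on Python dictionaries keeping insertion order (Python 3.7+), so the order of the keys in the dictionary decides the order of the columns in the result.

```python
import pandas as pd

# Cut-down version of the sample data (two rows are enough to show the idea).
df = pd.DataFrame(
    {
        'Commander': ['Jason', 'Molly'],
        'Score': [4, 24],
        'Team': ['Green', 'Yellow'],
    },
    index=['Cochice', 'Pima'],
)

# Hypothetical variation of selector_d: same pairs, different key order.
reordered_d = {'Score': 'Sco', 'Team': 'Team', 'Commander': 'Com'}

# Rename first, then select the new names in the dictionary's order.
result = df.rename(columns=reordered_d)[list(reordered_d.values())]
print(result.columns.tolist())  # ['Sco', 'Team', 'Com']
```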
Actually, you don't need the double brackets to select the columns from selector_d.values(), as seen here:
df.rename(columns=selector_d)[[*selector_d.values()]].equals(
    df.rename(columns=selector_d)[selector_d.values()]
)
True
So, df.rename(columns=selector_d)[selector_d.values()] will suffice to select the new columns.
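If you do this often, the idea can be wrapped in a small helper. This is only a sketch: select_rename is a name I made up, not a pandas function, and it swaps the order of the two steps (select by the old names first, then rename), which should give the same result while only renaming the columns that are kept.

```python
import pandas as pd


def select_rename(frame, mapping):
    """Keep only the columns named in mapping's keys, renamed to its values.

    Hypothetical helper, not part of pandas. Column order follows the
    order of the keys in mapping.
    """
    # list(mapping) is the list of old column names, in dictionary order.
    return frame[list(mapping)].rename(columns=mapping)


df = pd.DataFrame(
    {
        'Commander': ['Jason', 'Molly', 'Tina', 'Jake', 'Amy'],
        'Date': ['2012, 02, 08'] * 5,
        'Score': [4, 24, 31, 2, 3],
        'Team': ['Green', 'Yellow', 'Green', 'Yellow', 'Yellow'],
    },
    index=['Cochice', 'Pima', 'Santa Cruz', 'Maricopa', 'Yuma'],
)

selector_d = {'Team': 'Team', 'Commander': 'Com', 'Score': 'Sco'}
selected = select_rename(df, selector_d)
print(selected)
```

The original df is left unchanged, since both the column selection and rename return new dataframes.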