I have a pandas dataframe like this:
  character  count
0         a    104
1         b     30
2         c    210
3         d     40
4         e    189
5         f     20
6         g     10
I want to keep only the top 3 characters by count and combine the remaining rows into a single "others" row,
so the table becomes:
  character  count
0         c    210
1         e    189
2         a    104
3    others    100
How can I achieve this?
Thank you.
The concat() function in pandas is used to append either columns or rows from one DataFrame to another.
You can concatenate two or more text/string columns in a pandas DataFrame with the + operator. Note that when you apply + to numeric columns, it performs addition instead of concatenation.
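A minimal sketch of that difference, using made-up data (the column names `first`, `second`, `n1`, `n2` are assumptions for illustration only):

```python
import pandas as pd

# Hypothetical example data; the column names are illustrative only.
df = pd.DataFrame({
    "first": ["a", "b"],
    "second": ["x", "y"],
    "n1": [1, 2],
    "n2": [10, 20],
})

# On string columns, + concatenates element-wise.
df["joined"] = df["first"] + "_" + df["second"]

# On numeric columns, + adds element-wise.
df["total"] = df["n1"] + df["n2"]

# Cast to str first if the numbers should be joined as text instead.
df["n_text"] = df["n1"].astype(str) + df["n2"].astype(str)
```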
merge() combines data on common columns or indices, join() combines data on a key column or an index, and concat() combines DataFrames across rows or columns.
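As a rough illustration of the three functions side by side, here is a small sketch on two made-up frames (`left`, `right`, and their columns are assumptions, not part of the question):

```python
import pandas as pd

# Hypothetical example frames sharing a 'key' column.
left = pd.DataFrame({"key": ["a", "b", "c"], "x": [1, 2, 3]})
right = pd.DataFrame({"key": ["a", "b", "d"], "y": [10, 20, 40]})

# merge(): joins on the common column; an inner join by default,
# so only keys present in both frames survive.
merged = left.merge(right, on="key")

# join(): aligns on the index; a left join by default,
# so every row of the calling frame is kept.
joined = left.set_index("key").join(right.set_index("key"))

# concat(): stacks frames along an axis (rows here).
stacked = pd.concat([left, left], ignore_index=True)
```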
You can group DataFrame rows into a list by using the pandas.DataFrame.groupby() function on the column of interest, selecting the column you want as a list from each group, and then using Series.apply(list) to get the list for every group.
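A short sketch of that pattern, on made-up data (the `group` and `value` column names are assumptions):

```python
import pandas as pd

# Hypothetical example data.
df = pd.DataFrame({
    "group": ["x", "y", "x", "y", "x"],
    "value": [1, 2, 3, 4, 5],
})

# Group by 'group', select 'value', and collect each group's values
# into a Python list. The result is a Series indexed by group label.
lists = df.groupby("group")["value"].apply(list)
```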
We can use the DataFrame.nlargest() method:
In [31]: new = df.nlargest(3, columns='count')
In [32]: new = pd.concat(
    ...:     [new,
    ...:      pd.DataFrame({'character': ['others'],
    ...:                    'count': df.drop(new.index)['count'].sum()})
    ...:     ], ignore_index=True)
    ...:
In [33]: new
Out[33]:
  character  count
0         c    210
1         e    189
2         a    104
3    others    100
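Or a slightly less idiomatic solution (it keeps the original index labels, and it relies on the label len(new), here 3, not already being among the top rows' index labels):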
In [16]: new = df.nlargest(3, columns='count')
In [17]: new.loc[len(new)] = ['others', df.drop(new.index)['count'].sum()]
In [18]: new
Out[18]:
  character  count
2         c    210
4         e    189
0         a    104
3    others    100
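For reuse, the same idea can be wrapped in a small helper. This is only a sketch: the function name `top_n_with_others` and its parameters are hypothetical, and it assumes the DataFrame has a unique index so that `drop(top.index)` removes exactly the top rows.

```python
import pandas as pd


def top_n_with_others(df, n=3, label_col="character", value_col="count",
                      other_label="others"):
    """Keep the n largest rows by value_col and sum the rest into one row.

    Hypothetical helper; assumes df has a unique index.
    """
    # Rows with the n largest values, sorted in descending order.
    top = df.nlargest(n, columns=value_col)
    # Sum of the values in every row that is not in the top n.
    rest_total = df.drop(top.index)[value_col].sum()
    others = pd.DataFrame({label_col: [other_label], value_col: [rest_total]})
    # ignore_index=True renumbers the result 0..n.
    return pd.concat([top, others], ignore_index=True)


# The data from the question.
df = pd.DataFrame({
    "character": list("abcdefg"),
    "count": [104, 30, 210, 40, 189, 20, 10],
})
result = top_n_with_others(df)
```

With the question's data, `result` holds c, e, a and an "others" row whose count is 30 + 40 + 20 + 10 = 100.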