I have a pandas dataframe like this:
  character  count
0         a    104
1         b     30
2         c    210
3         d     40
4         e    189
5         f     20
6         g     10
I want to keep only the top 3 characters in the dataframe and combine the remaining ones as others, so the table becomes:
  character  count
0         c    210
1         e    189
2         a    104
3    others    100
How can I achieve this?
Thank you.
The concat() function in pandas is used to append either columns or rows from one DataFrame to another.
You can concatenate two or more text/string columns in a pandas DataFrame simply by using the + operator. Note that applying the + operator to numeric columns performs addition instead of concatenation.
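As a small illustration of that difference, here is a sketch using a made-up DataFrame (the column names `first`, `last`, `n1` and `n2` are hypothetical, chosen only for this example):

```python
import pandas as pd

# Hypothetical data, only for illustration
demo = pd.DataFrame({
    "first": ["a", "b"],
    "last": ["x", "y"],
    "n1": [1, 2],
    "n2": [10, 20],
})

# String columns: + concatenates element-wise
joined = demo["first"] + demo["last"]

# Numeric columns: + adds element-wise
added = demo["n1"] + demo["n2"]

# To concatenate a numeric column as text, convert it to str first
combined = demo["first"] + "_" + demo["n1"].astype(str)
```

Without the `astype(str)` conversion, adding a string column to a numeric one raises an error, so the cast is needed whenever the two kinds are mixed.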
merge() combines data on common columns or indices; join() combines data on a key column or an index; concat() combines DataFrames across rows or columns.
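A minimal sketch of the three calls side by side, assuming two small made-up frames (`left`, `right`, and the `key`, `x`, `y` columns are hypothetical names for this example):

```python
import pandas as pd

# Hypothetical frames sharing a "key" column
left = pd.DataFrame({"key": ["a", "b", "c"], "x": [1, 2, 3]})
right = pd.DataFrame({"key": ["a", "b", "d"], "y": [10, 20, 40]})

# merge(): inner join on the common column by default, keeps keys a and b
merged = left.merge(right, on="key")

# join(): left join on the index by default, keeps keys a, b and c
joined = left.set_index("key").join(right.set_index("key"))

# concat(): stacks rows; columns missing from one frame are filled with NaN
stacked = pd.concat([left, right], ignore_index=True)
```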
You can group DataFrame rows into a list by using the pandas.DataFrame.groupby() function on the column of interest, selecting the column you want as a list from each group, and then using Series.apply(list) to get the list for every group.
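For example, a rough sketch with made-up data (the `group` and `value` column names are assumptions for illustration only):

```python
import pandas as pd

# Hypothetical data, only for illustration
demo = pd.DataFrame({
    "group": ["x", "y", "x", "y", "x"],
    "value": [1, 2, 3, 4, 5],
})

# Group by "group", select "value", and collect each group's values into a list
lists = demo.groupby("group")["value"].apply(list)
```

The result is a Series indexed by the group labels, where each element is a plain Python list.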
We can use the DataFrame.nlargest() method:
In [31]: new = df.nlargest(3, columns='count')
In [32]: new = pd.concat(
    ...:         [new,
    ...:          pd.DataFrame({'character':['others'],
    ...:                        'count':df.drop(new.index)['count'].sum()})
    ...:         ], ignore_index=True)
    ...:
In [33]: new
Out[33]:
  character  count
0         c    210
1         e    189
2         a    104
3    others    100
Or a slightly less idiomatic solution, which keeps the original index labels:
In [16]: new = df.nlargest(3, columns='count')
In [17]: new.loc[len(new)] = ['others', df.drop(new.index)['count'].sum()]
In [18]: new
Out[18]:
  character  count
2         c    210
4         e    189
0         a    104
3    others    100
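If you need this more than once, the concat approach could be wrapped in a small helper. This is only a sketch: the function name `top_n_with_others` and its parameters are made up here, and it assumes the DataFrame has a unique index so that `drop(top.index)` removes exactly the top rows.

```python
import pandas as pd

def top_n_with_others(df, n=3, label_col="character", value_col="count",
                      other_label="others"):
    """Keep the n largest rows by value_col and sum the rest into one row."""
    top = df.nlargest(n, columns=value_col)
    # Sum of everything that did not make it into the top n
    rest_total = df.drop(top.index)[value_col].sum()
    others = pd.DataFrame({label_col: [other_label], value_col: [rest_total]})
    return pd.concat([top, others], ignore_index=True)

# The data from the question
df = pd.DataFrame({
    "character": list("abcdefg"),
    "count": [104, 30, 210, 40, 189, 20, 10],
})

result = top_n_with_others(df)
print(result)
```

Note that when n is at least the number of rows, this sketch still appends an others row with a count of 0; you may want to skip it in that case.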