Pandas, group by count and add count to original dataframe?

Tags: pandas, dataframe

I am trying to count the rows that share the same 'kind' in a data frame:

import pandas as pd

items = [('aaa','aaa text 1'), ('aaa','aaa text 2'), ('aaa','aaa text 3'),
         ('bb', 'bb text 1'), ('bb', 'bb text 2'), ('bb', 'bb text 3'), 
         ('bb', 'bb text 4'),
         ('cccc','cccc text 1'), ('cccc','cccc text 2'),
         ('dd', 'dd text 1'),
         ('e', 'e text 1'),
         ('fff', 'fff text 1'),
        ]

df = pd.DataFrame(items, columns=['kind', 'msg'])
df

    kind    msg
0   aaa     aaa text 1
1   aaa     aaa text 2
2   aaa     aaa text 3
3   bb      bb text 1
4   bb      bb text 2
5   bb      bb text 3
6   bb      bb text 4
7   cccc    cccc text 1
8   cccc    cccc text 2
9   dd      dd text 1
10  e       e text 1
11  fff     fff text 1

This code counts the rows for each kind:

df = df[['kind']].groupby(['kind'])['kind'] \
                         .count() \
                         .reset_index(name='count') \
                         .sort_values(['count'], ascending=False) \
                         .head(5)

df

Resulting in:

    kind      count
    0   aaa   1
    1   bb    1
    2   cccc  1
    3   dd    1
    4   e     1

However, how can I get a data frame with all the columns of the original one plus a 'count' column, so that the result has the columns 'kind', 'msg', 'count' in that order?

Also, how can I sort the resulting data frame in descending order of count?

609

asked Jul 27 '17 09:07

dokondr

2 Answers

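If I understand correctly, you can broadcast the per-group count back onto every row with transform. Selecting the 'kind' column first keeps the result a Series, so the assignment also works when the frame has more than one non-key column:

In [247]: df['count'] = df.groupby('kind')['kind'].transform('count')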

In [248]: df
Out[248]:
    kind          msg  count
0    aaa   aaa text 1      3
1    aaa   aaa text 2      3
2    aaa   aaa text 3      3
3     bb    bb text 1      4
4     bb    bb text 2      4
5     bb    bb text 3      4
6     bb    bb text 4      4
7   cccc  cccc text 1      2
8   cccc  cccc text 2      2
9     dd    dd text 1      1
10     e     e text 1      1
11   fff   fff text 1      1

Sorting by count in descending order:

In [249]: df.sort_values('count', ascending=False)
Out[249]:
    kind          msg  count
3     bb    bb text 1      4
4     bb    bb text 2      4
5     bb    bb text 3      4
6     bb    bb text 4      4
0    aaa   aaa text 1      3
1    aaa   aaa text 2      3
2    aaa   aaa text 3      3
7   cccc  cccc text 1      2
8   cccc  cccc text 2      2
9     dd    dd text 1      1
10     e     e text 1      1
11   fff   fff text 1      1
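For anyone who wants to run both steps end to end, here is a minimal sketch that rebuilds the frame from the question, adds the count, and sorts it. The name `result` and the choice of `kind='mergesort'` are my own assumptions, not part of the answer above; a stable sort is used only so that rows with equal counts keep their original relative order.

```python
import pandas as pd

items = [('aaa', 'aaa text 1'), ('aaa', 'aaa text 2'), ('aaa', 'aaa text 3'),
         ('bb', 'bb text 1'), ('bb', 'bb text 2'), ('bb', 'bb text 3'),
         ('bb', 'bb text 4'),
         ('cccc', 'cccc text 1'), ('cccc', 'cccc text 2'),
         ('dd', 'dd text 1'),
         ('e', 'e text 1'),
         ('fff', 'fff text 1')]

df = pd.DataFrame(items, columns=['kind', 'msg'])

# Work on a copy so the original frame is left untouched (assumed name: result).
result = df.copy()

# transform returns one value per original row, so it aligns with the frame.
result['count'] = result.groupby('kind')['kind'].transform('count')

# A stable sort keeps tied rows in their original relative order.
result = result.sort_values('count', ascending=False, kind='mergesort')

print(result)
```

Because the new column is assigned last, the columns come out as 'kind', 'msg', 'count', which is the order the question asks for.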

111

answered Sep 19 '22 17:09

MaxU - stop WAR against UA

Here is a short way to count the frequencies and add them as a column to the data frame, grouping by the kind column:

df['count'] = df.groupby('kind')['kind'].transform('count')
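One detail worth knowing: 'count' only counts non-null values in the selected column, while 'size' counts every row in the group. The sketch below uses a small made-up frame (the data and the variable names `non_null`, `rows`, and `mapped` are hypothetical) to show the difference, plus a `value_counts` lookup that gives the same row count without a groupby.

```python
import pandas as pd

# Hypothetical data: one 'msg' value is missing on purpose.
df = pd.DataFrame({'kind': ['a', 'a', 'b'], 'msg': ['x', None, 'z']})

# 'count' ignores missing values in the selected column.
non_null = df.groupby('kind')['msg'].transform('count')

# 'size' counts every row in the group, missing values included.
rows = df.groupby('kind')['msg'].transform('size')

# Same row count without groupby: look up each kind in value_counts().
mapped = df['kind'].map(df['kind'].value_counts())

print(non_null.tolist(), rows.tolist(), mapped.tolist())
```

Counting the grouping column itself, as in the one-liner above, avoids the issue as long as 'kind' has no missing values.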

answered Sep 19 '22 17:09

Shubham Singh Chauhan
