Pandas Groupby: Count and mean combined

I'm working with pandas and trying to summarise a data frame as a count per category, together with the mean sentiment score for each category.

I have a table of strings with different sentiment scores, and I want to group by text source, showing how many posts each source has as well as the average sentiment of those posts.

My (simplified) data frame looks like this:

source    text              sent
--------------------------------
bar       some string       0.13
foo       alt string        -0.8
bar       another str       0.7
foo       some text         -0.2
foo       more text         -0.5

The output from this should be something like this:

source    count     mean_sent
-----------------------------
foo       3         -0.5
bar       2         0.415

The answer is somewhere along the lines of:

df['sent'].groupby(df['source']).mean()

Yet this only gives each source and its mean, with no column headers.

asked Dec 08 '16 12:12

Lewis Anderson

1 Answer

You can use groupby with aggregate:

df = df.groupby('source') \
       .agg({'text':'size', 'sent':'mean'}) \
       .rename(columns={'text':'count','sent':'mean_sent'}) \
       .reset_index()
print(df)

  source  count  mean_sent
0    bar      2      0.415
1    foo      3     -0.500
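For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the sample frame from the question and, as an alternative to the dict plus rename above, uses named aggregation, which assumes pandas 0.25 or newer. The variable name `summary` is just an illustrative choice, not something from the question.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "source": ["bar", "foo", "bar", "foo", "foo"],
    "text": ["some string", "alt string", "another str", "some text", "more text"],
    "sent": [0.13, -0.8, 0.7, -0.2, -0.5],
})

# Named aggregation: output_column=(input_column, function).
# This names the result columns directly, so no rename step is needed.
summary = (
    df.groupby("source")
      .agg(count=("text", "size"), mean_sent=("sent", "mean"))
      .reset_index()  # turn the 'source' index back into a regular column
)

print(summary)
```

Note that 'size' counts every row in the group, including rows where the value is missing, whereas 'count' only counts non-null values. For the sample data both give the same numbers.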

answered Sep 22 '22 18:09

jezrael

Tags: python, pandas, dataframe, group-by, python-2.7
