Pandas: Groupby to create table with count and count values

My objective is simple, but I'm not sure if it's possible. Here is a reproducible example.

Can you go from this:

import numpy as np
import pandas as pd

raw_data = {'score': [1, 3, 4, 4, 1, 2, 2, 4, 4, 2],
        'player': ['Miller', 'Jacobson', 'Ali', 'George', 'Cooze', 'Wilkinson', 'Lewis', 'Lewis', 'Lewis', 'Jacobson']}
df = pd.DataFrame(raw_data, columns = ['score', 'player'])
df

    score   player
0   1       Miller
1   3       Jacobson
2   4       Ali
3   4       George
4   1       Cooze
5   2       Wilkinson
6   2       Lewis
7   4       Lewis
8   4       Lewis
9   2       Jacobson

To this:

        score    col_1       col_2       col_3       col_4     
score   
1       2        Miller      Cooze       n/a         n/a
2       3        Wilkinson   Lewis       Jacobson    n/a
3       1        Jacobson    n/a         n/a         n/a
4       4        Ali         George      Lewis       Lewis

Via a groupby?

I can get this far: df.groupby(['score']).agg({'score': np.size}), but I can't work out how to create the new columns from the player values in each group.

asked May 30 '17 20:05

RDJ

1 Answer

I can reproduce your output in either of two ways. Note that the new columns come out numbered from col_0 rather than col_1.

Option 1

g = df.groupby('score').player
g.size().to_frame('score').join(g.apply(list).apply(pd.Series).add_prefix('col_'))

       score      col_0   col_1     col_2  col_3
score                                           
1          2     Miller   Cooze       NaN    NaN
2          3  Wilkinson   Lewis  Jacobson    NaN
3          1   Jacobson     NaN       NaN    NaN
4          4        Ali  George     Lewis  Lewis
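The one-liner in Option 1 can be easier to follow when split into steps. The following is only a sketch of the same idea, not the answerer's exact code: the intermediate names counts, lists, wide and result are made up for illustration, and the lists are expanded with the DataFrame constructor instead of apply(pd.Series).

```python
import pandas as pd

raw_data = {'score': [1, 3, 4, 4, 1, 2, 2, 4, 4, 2],
            'player': ['Miller', 'Jacobson', 'Ali', 'George', 'Cooze',
                       'Wilkinson', 'Lewis', 'Lewis', 'Lewis', 'Jacobson']}
df = pd.DataFrame(raw_data, columns=['score', 'player'])

# Group the player column by score.
g = df.groupby('score').player

# Step 1: number of players per score, as a one-column frame named 'score'.
counts = g.size().to_frame('score')

# Step 2: collect the players of each group into a Python list.
lists = g.apply(list)

# Step 3: spread the ragged lists into columns; short rows are padded with
# missing values. Columns are 0, 1, 2, ... before the prefix is added.
wide = pd.DataFrame(lists.tolist(), index=lists.index).add_prefix('col_')

# Step 4: put the counts and the player columns side by side.
result = counts.join(wide)
print(result)
```

Both frames are indexed by score, so the join simply lines the rows up on that index.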

Option 2

d1 = df.groupby('score').agg({'score': 'size', 'player': lambda x: tuple(x)})
d1.join(pd.DataFrame(d1.pop('player').values.tolist()).add_prefix('col_'))

       score      col_0   col_1     col_2  col_3
score                                           
1          2     Miller   Cooze       NaN    NaN
2          3  Wilkinson   Lewis  Jacobson    NaN
3          1   Jacobson     NaN       NaN    NaN
4          4        Ali  George     Lewis  Lewis
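If you want the columns numbered from col_1 as in the question, one possible alternative (a different technique from the two options above, offered as a sketch) is to number the rows within each group using cumcount and then pivot. The helper column pos and the column name count are assumptions chosen for this example; count is used instead of score only to avoid clashing with the index name.

```python
import pandas as pd

raw_data = {'score': [1, 3, 4, 4, 1, 2, 2, 4, 4, 2],
            'player': ['Miller', 'Jacobson', 'Ali', 'George', 'Cooze',
                       'Wilkinson', 'Lewis', 'Lewis', 'Lewis', 'Jacobson']}
df = pd.DataFrame(raw_data, columns=['score', 'player'])

# Position of each row within its score group, starting at 1.
pos = df.groupby('score').cumcount() + 1

# One row per score, one column per position; gaps become missing values.
wide = (df.assign(pos=pos)
          .pivot(index='score', columns='pos', values='player')
          .add_prefix('col_'))
wide.columns.name = None  # drop the leftover 'pos' label on the columns

# Prepend the group sizes as the first column (aligned on the score index).
wide.insert(0, 'count', df.groupby('score').size())
print(wide)
```

Because each (score, pos) pair is unique by construction, pivot does not run into duplicate entries here.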

answered Sep 21 '22 11:09

piRSquared


Tags: python, pandas, pandas-groupby
