
Python: How to add specific columns of .mean to dataframe

How can I add the means of b and c to my dataframe? I tried a merge, but it didn't seem to work. I want two extra columns, b_mean and c_mean, added to my dataframe with the results of df.groupby('date').mean().

DataFrame

   a  b  c  date
0  2  3  5     1
1  5  9  1     1
2  3  7  1     1

I have the following code

import pandas as pd

a = [{'date': 1,'a':2, 'b':3, 'c':5}, {'date':1, 'a':5, 'b':9, 'c':1}, {'date':1, 'a':3, 'b':7, 'c':1}]

df = pd.DataFrame(a)

x =  df.groupby('date').mean()

Edit:

df.groupby('date').mean() returns:

             a         b         c
date                              
1     3.333333  6.333333  2.333333

My desired result would be the following data frame

   a  b  c  date  a_mean   b_mean
0  2  3  5     1  3.3333   6.3333
1  5  9  1     1  3.3333   6.3333 
2  3  7  1     1  3.3333   6.3333
John Decker asked Mar 26 '17 22:03


1 Answer

As @ayhan mentioned, you can use groupby(...).transform() for this. transform is like apply, but its result is indexed like the original dataframe rather than by the unique values of the grouped column(s), so it can be assigned straight back as a new column.

df['a_mean'] = df.groupby('date')['a'].transform('mean')
df['b_mean'] = df.groupby('date')['b'].transform('mean')

>>> df
   a  b  c  date    a_mean    b_mean
0  2  3  5     1  3.333333  6.333333
1  5  9  1     1  3.333333  6.333333
2  3  7  1     1  3.333333  6.333333
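
If you need the means of several columns, one possible variation is to transform them together and join the result back. This is only a sketch using the question's data; the names rows, per_group, means and result are made up for illustration. It also shows the index difference described above: mean() gives one row per date, while transform('mean') gives one row per original row.

```python
import pandas as pd

rows = [
    {'date': 1, 'a': 2, 'b': 3, 'c': 5},
    {'date': 1, 'a': 5, 'b': 9, 'c': 1},
    {'date': 1, 'a': 3, 'b': 7, 'c': 1},
]
df = pd.DataFrame(rows)

# Aggregating collapses each group to a single row, indexed by the group key.
per_group = df.groupby('date')[['a', 'b']].mean()

# transform broadcasts each group's mean back onto the original row index;
# add_suffix renames the columns to a_mean and b_mean.
means = df.groupby('date')[['a', 'b']].transform('mean').add_suffix('_mean')

# means shares df's index, so join lines the new columns up row by row.
result = df.join(means)
print(result)
```

Swap ['a', 'b'] for whichever columns you want means for, for example ['b', 'c'].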
3novak answered Sep 23 '22 12:09