Calculate mean for selected rows for selected columns in pandas data frame

Tags:

I have pandas df with say, 100 rows, 10 columns, (actual data is huge). I also have row_index list which contains, which rows to be considered to take mean. I want to calculate mean on say columns 2,5,6,7 and 8. Can we do it with some function for dataframe object?

What I know is do a for loop, get value of row for each element in row_index and keep doing mean. Do we have some direct function where we can pass row_list, and column_list and axis, for ex df.meanAdvance(row_list,column_list,axis=0) ?

I have seen DataFrame.mean() but it didn't help I guess.

  a b c d q  0 1 2 3 0 5 1 1 2 3 4 5 2 1 1 1 6 1 3 1 0 0 0 0

I want mean of 0, 2, 3 rows for each a, b, d columns

  a b d 0 1 1 2

804

asked Apr 06 '16 14:04

impossible

1 Answers

To select the rows of your dataframe you can use iloc, you can then select the columns you want using square brackets.

For example:

 df = pd.DataFrame(data=[[1,2,3]]*5, index=range(3, 8), columns = ['a','b','c'])

gives the following dataframe:

   a  b  c 3  1  2  3 4  1  2  3 5  1  2  3 6  1  2  3 7  1  2  3

to select only the 3d and fifth row you can do:

df.iloc[[2,4]]

which returns:

   a  b  c 5  1  2  3 7  1  2  3

if you then want to select only columns b and c you use the following command:

df[['b', 'c']].iloc[[2,4]]

which yields:

   b  c 5  2  3 7  2  3

To then get the mean of this subset of your dataframe you can use the df.mean function. If you want the means of the columns you can specify axis=0, if you want the means of the rows you can specify axis=1

thus:

df[['b', 'c']].iloc[[2,4]].mean(axis=0)

returns:

b    2 c    3

As we should expect from the input dataframe.

For your code you can then do:

 df[column_list].iloc[row_index_list].mean(axis=0)

EDIT after comment: New question in comment: I have to store these means in another df/matrix. I have L1, L2, L3, L4...LX lists which tells me the index whose mean I need for columns C[1, 2, 3]. For ex: L1 = [0, 2, 3] , means I need mean of rows 0,2,3 and store it in 1st row of a new df/matrix. Then L2 = [1,4] for which again I will calculate mean and store it in 2nd row of the new df/matrix. Similarly till LX, I want the new df to have X rows and len(C) columns. Columns for L1..LX will remain same. Could you help me with this?

Answer:

If i understand correctly, the following code should do the trick (Same df as above, as columns I took 'a' and 'b':

first you loop over all the lists of rows, collection all the means as pd.series, then you concatenate the resulting list of series over axis=1, followed by taking the transpose to get it in the right format.

dfs = list() for l in L:     dfs.append(df[['a', 'b']].iloc[l].mean(axis=0))  mean_matrix = pd.concat(dfs, axis=1).T

157

answered Oct 23 '22 20:10

PdevG

Related questions
                            
                                Using Spark to write a parquet file to s3 over s3a is very slow
                            
                                async/await and opening a FileStream?
                            
                                Hashing an array in c#
                            
                                Filter array of objects that contains string using lodash
                            
                                How to unit test if function output list of dictionaries?
                            
                                Angular2 router, get route data from url, to display breadcrumbs
                            
                                Angular 2: How to conditionally load a Component in a Route asynchronously?
                            
                                Compile-time or runtime detection within a constexpr function
                            
                                How do I enable auto code formatting for flake8 in PyCharm
                            
                                When creating a composer package, what "Package Type" should I choose
                            
                                Mapbox GL JS: Export map to PNG or PDF?
                            
                                Using an object after std::move doesn't result in a compilation error

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With