
Calculating averages of multiple columns, ignoring NaN (pandas, NumPy)

Tags: python, pandas

I have a basic table of values:

import pandas as pd
import numpy as np
test = pd.read_csv('mean_test.csv')
test = test.replace('n/a', np.nan)  # replace() returns a new DataFrame, so assign it back
test


value1  value2  value3
     1       9       5
     5     NaN       4
     9      55     NaN
   NaN       4       9

I want to work out the average of the three values in each row, ignoring NaN, so for the second row it would be (5+4)/2 = 4.5. That means I can't just use .replace to put a zero in place of each NaN, because the zero would still count towards the number of values being averaged. I have searched through some other questions, but can't find anything that covers this. Am I missing something obvious?

DGraham asked Dec 27 '15 15:12




1 Answer

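Pandas takes care of the NaN for you. DataFrame.mean() skips NaN by default (skipna=True), so a row-wise mean with axis=1 gives what you want: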

>>> df
   value1  value2  value3
0       1       9       5
1       5     NaN       4
2       9      55     NaN
3     NaN       4       9

>>> df.mean(axis=1)
0     5.0
1     4.5
2    32.0
3     6.5
dtype: float64
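
For anyone who wants to reproduce this without the CSV, here is a minimal sketch. It assumes the table can be rebuilt inline from the values shown in the question (the real mean_test.csv isn't available here), and the variable names are only illustrative. It also shows what happens with skipna=False, and uses np.nanmean purely as a cross-check.

```python
import numpy as np
import pandas as pd

# Assumption: rebuild the question's table inline instead of reading mean_test.csv.
df = pd.DataFrame({
    "value1": [1, 5, 9, np.nan],
    "value2": [9, np.nan, 55, 4],
    "value3": [5, 4, np.nan, 9],
})

# skipna=True is the default: NaN cells are left out of both the sum and the count.
row_means = df.mean(axis=1)

# With skipna=False, any row containing a NaN yields NaN instead.
row_means_strict = df.mean(axis=1, skipna=False)

# NumPy cross-check: nanmean also ignores NaN when averaging along each row.
row_means_np = np.nanmean(df.to_numpy(), axis=1)

print(row_means.tolist())
print(row_means_strict.tolist())
```

The second row comes out as (5+4)/2 = 4.5 rather than (5+0+4)/3, which is the difference from filling NaN with zero first.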
Mike Müller answered Sep 30 '22 00:09
