I have a basic table of values:
import pandas as pd
import numpy as np
test = pd.read_csv('mean_test.csv')
test = test.replace('n/a', np.nan)  # replace() returns a new DataFrame, so assign the result
test
value1 value2 value3
1 9 5
5 NaN 4
9 55 NaN
NaN 4 9
I want to work out the average of the three values in each row, ignoring NaN, so for the second row it would be (5+4)/2 = 4.5. That means I can't use .replace to put a zero in place of NaN, because the zero would still count towards the number of values being averaged. I have searched through some other questions, but can't find anything that covers this. Am I missing something obvious?
To get the column average (mean) from a pandas DataFrame, use either the mean() or the describe() method. The DataFrame.mean() method returns the mean of the values along the requested axis.
Key points for pandas mean(): by default it ignores NaN values and computes the mean along the index axis (axis=0), which gives one mean per column.
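As a rough illustration of those defaults, the sketch below rebuilds the table from the question by hand (the DataFrame constructor call is an assumption standing in for the CSV file) and compares mean() with the "mean" row of describe(). The variable names are illustrative only.

```python
import numpy as np
import pandas as pd

# Hand-built stand-in for the question's table (assumption: same values as the CSV)
df = pd.DataFrame({
    "value1": [1, 5, 9, np.nan],
    "value2": [9, np.nan, 55, 4],
    "value3": [5, 4, np.nan, 9],
})

# Default: axis=0 and skipna=True, so one mean per column with NaN left out
col_means = df.mean()

# describe() reports several statistics; its "mean" row is computed the same way
describe_means = df.describe().loc["mean"]

# With skipna=False a single NaN makes the whole column mean NaN
strict_col_means = df.mean(skipna=False)

print(col_means)
print(describe_means)
print(strict_col_means)
```

Since every column here contains one NaN, the skipna=False result is NaN for all three columns, which shows why the default is usually the more convenient behaviour.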
Using DataFrame.fillna() from the pandas library, we can easily replace NaN values in the data frame. Procedure: first compute the mean of a particular column with mean(), then use fillna() to replace every NaN in that column with that mean.
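A minimal sketch of that procedure, assuming the same values as the question's table and picking value2 as the column to fill (the choice of column is arbitrary):

```python
import numpy as np
import pandas as pd

# Hand-built stand-in for the question's table (assumption: same values as the CSV)
df = pd.DataFrame({
    "value1": [1, 5, 9, np.nan],
    "value2": [9, np.nan, 55, 4],
    "value3": [5, 4, np.nan, 9],
})

filled = df.copy()  # work on a copy so the original NaN values are preserved

# Mean of the column, computed with NaN skipped: (9 + 55 + 4) / 3
value2_mean = filled["value2"].mean()

# Replace the NaN in that column with the column mean
filled["value2"] = filled["value2"].fillna(value2_mean)

print(filled)
```

Note that this imputes a column-wise mean, so it changes the row averages: it is a way to fill gaps, not a way to average each row while ignoring NaN.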
We can also pass fillna() a dictionary that maps column names to fill values, which imputes the missing values of each listed column. The limitation of this method is that each column can only be filled with a constant value.
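For illustration, a dictionary-based fill might look like the sketch below. The fill constants are hypothetical placeholders chosen only to make the result easy to see.

```python
import numpy as np
import pandas as pd

# Hand-built stand-in for the question's table (assumption: same values as the CSV)
df = pd.DataFrame({
    "value1": [1, 5, 9, np.nan],
    "value2": [9, np.nan, 55, 4],
    "value3": [5, 4, np.nan, 9],
})

# Hypothetical constants, one per column
fill_values = {"value1": 0, "value2": -1, "value3": 100}

# fillna() returns a new DataFrame; each column's NaN gets that column's constant
filled = df.fillna(value=fill_values)

print(filled)
```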
Pandas takes care of the NaN for you:
>>> df
   value1  value2  value3
0       1       9       5
1       5     NaN       4
2       9      55     NaN
3     NaN       4       9
>>> df.mean(axis=1)
0     5.0
1     4.5
2    32.0
3     6.5
dtype: float64
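If you want to keep the result next to the data, one option is to store it in a new column. The sketch below builds the frame by hand instead of reading the CSV, and the column name "average" is just an illustrative choice:

```python
import numpy as np
import pandas as pd

# Hand-built stand-in for the question's table (assumption: same values as the CSV)
df = pd.DataFrame({
    "value1": [1, 5, 9, np.nan],
    "value2": [9, np.nan, 55, 4],
    "value3": [5, 4, np.nan, 9],
})

value_cols = ["value1", "value2", "value3"]

# Row-wise mean over the value columns only; NaN is skipped by default
df["average"] = df[value_cols].mean(axis=1)

# For contrast: skipna=False gives NaN for any row that has a missing value
strict = df[value_cols].mean(axis=1, skipna=False)

print(df)
print(strict)
```

Selecting the value columns explicitly keeps the new "average" column out of the calculation if the line is run a second time.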