
How can I get the average (mean) of selected columns

Tags:

r

I would like to get the average for certain columns for each row.

I have this data:

w=c(5,6,7,8)
x=c(1,2,3,4)
y=c(1,2,3)
length(y)=4
z=data.frame(w,x,y)

Which returns:

  w x  y
1 5 1  1
2 6 2  2
3 7 3  3
4 8 4 NA

I would like to get the mean for certain columns, not all of them. My problem is that there are a lot of NAs in my data. So if I wanted the mean of x and y, this is what I would like to get back:

  w x  y mean
1 5 1  1    1
2 6 2  2    2
3 7 3  3    3
4 8 4 NA    4

I guess I could do something like z$mean=(z$x+z$y)/2 but the last row for y is NA so obviously I do not want the NA to be calculated and I should not be dividing by two. I tried cumsum but that returns NAs when there is a single NA in that row. I guess I am looking for something that will add the selected columns, ignore the NAs, get the number of selected columns that do not have NAs and divide by that number. I tried ??mean and ??average and am completely stumped.

ETA: Is there also a way I can add a weight to a specific column?

thequerist asked Feb 28 '12 22:02




2 Answers

Here are some examples:

> z$mean <- rowMeans(subset(z, select = c(x, y)), na.rm = TRUE)
> z
  w x  y mean
1 5 1  1    1
2 6 2  2    2
3 7 3  3    3
4 8 4 NA    4

Weighted mean:

> z$y <- rev(z$y)
> z
  w x  y mean
1 5 1 NA    1
2 6 2  3    2
3 7 3  2    3
4 8 4  1    4
>
> weight <- c(1, 2) # x * 1/3 + y * 2/3
> z$wmean <- apply(subset(z, select = c(x, y)), 1, function(d) weighted.mean(d, weight, na.rm = TRUE))
> z
  w x  y mean    wmean
1 5 1 NA    1 1.000000
2 6 2  3    2 2.666667
3 7 3  2    3 2.333333
4 8 4  1    4 2.000000
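For larger data frames, the apply() call above can be replaced with matrix arithmetic. What follows is only a sketch under stated assumptions: the names m, wm and wmean2 are made up for illustration, and the data is rebuilt here with y already reversed so the block runs on its own. The idea is to set the weight to zero wherever a value is missing, which is what weighted.mean() with na.rm = TRUE effectively does row by row.

```r
# Data from the question, with y reversed as in the example above
z <- data.frame(w = c(5, 6, 7, 8), x = c(1, 2, 3, 4), y = c(NA, 3, 2, 1))

weight <- c(x = 1, y = 2)            # same 1:2 weighting; names pick the columns
m <- as.matrix(z[, names(weight)])   # selected columns as a numeric matrix

# One row of weights per data row, set to zero wherever the value is missing
wm <- matrix(weight, nrow = nrow(m), ncol = ncol(m), byrow = TRUE)
wm[is.na(m)] <- 0
m[is.na(m)] <- 0                     # so that NA * 0 does not propagate as NA

# Weighted sum divided by the sum of the weights actually used in each row
z$wmean2 <- rowSums(m * wm) / rowSums(wm)
z
```

A row in which every selected column is NA ends up as 0/0, so it comes out as NaN rather than raising an error.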
kohske answered Sep 27 '22 17:09



Try using rowMeans:

z$mean=rowMeans(z[,c("x", "y")], na.rm=TRUE)

  w x  y mean
1 5 1  1    1
2 6 2  2    2
3 7 3  3    3
4 8 4 NA    4
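To see why this matches what the question asks for (add the selected columns, ignore the NAs, divide by the number of non-missing values), the same result can be built by hand. This is just an illustrative sketch; cols, total, n_obs and mean_manual are names chosen here, not part of the original answer.

```r
# Data from the question
z <- data.frame(w = c(5, 6, 7, 8), x = c(1, 2, 3, 4), y = c(1, 2, 3, NA))
cols <- c("x", "y")

total <- rowSums(z[, cols], na.rm = TRUE)   # sum of the non-missing values per row
n_obs <- rowSums(!is.na(z[, cols]))         # number of non-missing values per row
z$mean_manual <- total / n_obs              # NaN if a row has no values at all

# Agrees with the rowMeans() one-liner
all.equal(z$mean_manual, rowMeans(z[, cols], na.rm = TRUE), check.attributes = FALSE)
```

In practice rowMeans() with na.rm = TRUE is the shorter way to write this; the manual version is mainly useful when the count of non-missing values is needed as well.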
Andrew answered Sep 27 '22 17:09
