I have a data frame with NAs, and I want to replace each NA with the mean of its row:
c1 = c(1,2,3,NA)
c2 = c(3,1,NA,3)
c3 = c(2,1,3,1)
df = data.frame(c1,c2,c3)
> df
  c1 c2 c3
1  1  3  2
2  2  1  1
3  3 NA  3
4 NA  3  1
so that I get
> df
  c1 c2 c3
1  1  3  2
2  2  1  1
3  3  3  3
4  2  3  1
Very similar to @baptiste's answer
> ind <- which(is.na(df), arr.ind=TRUE)
> df[ind] <- rowMeans(df, na.rm = TRUE)[ind[,1]]
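To see why those two lines line up, here is a small sketch of the intermediate values on the question's data. It uses only base R; the names row_means and filled are made up for this illustration and are not part of the answer above.

```r
# Rebuild the example data frame from the question.
df <- data.frame(c1 = c(1, 2, 3, NA),
                 c2 = c(3, 1, NA, 3),
                 c3 = c(2, 1, 3, 1))

# One (row, col) pair per NA, listed column by column.
ind <- which(is.na(df), arr.ind = TRUE)

# Mean of every row, ignoring NAs (hypothetical helper name).
row_means <- rowMeans(df, na.rm = TRUE)

# ind[, 1] holds the row of each NA, so indexing row_means with it
# yields one replacement value per NA, in the same order as ind.
filled <- df
filled[ind] <- row_means[ind[, 1]]
filled
```

Because the replacement values are looked up by row number, this should also cope with rows that contain more than one NA.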
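I think this works, as long as each row mean is matched to the row of its NA (which() reports the NA positions column by column, not row by row):
ind <- which(is.na(df), arr.ind=TRUE)
df[ind] <- rowMeans(df[!complete.cases(df), ], na.rm=TRUE)[as.character(ind[,1])]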
Using apply (note the returned object is a matrix):
t( apply( df , 1 , function(x) { x[ is.na(x) ] = mean( x , na.rm = TRUE ); x } ) )
     c1 c2 c3
[1,]  1  3  2
[2,]  2  1  1
[3,]  3  3  3
[4,]  2  3  1
We use an anonymous function to change each NA in a row to the mean of that row. The only advantage is that you don't have to do any more typing if the number of rows increases. It is not particularly efficient or fast in a computational sense, but it is in a cognitive sense (you won't notice the difference unless you have hundreds of thousands of rows).
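If a data frame is needed rather than a matrix, the result can be converted back afterwards. A minimal sketch on the question's data, where fill_row and df_filled are names invented for this example:

```r
# Rebuild the example data frame from the question.
df <- data.frame(c1 = c(1, 2, 3, NA),
                 c2 = c(3, 1, NA, 3),
                 c3 = c(2, 1, 3, 1))

# Replace the NAs in one row with the mean of that row's non-missing values.
fill_row <- function(x) {
  x[is.na(x)] <- mean(x, na.rm = TRUE)
  x
}

# apply() returns each processed row as a column, hence the t();
# as.data.frame() then turns the matrix back into a data frame.
df_filled <- as.data.frame(t(apply(df, 1, fill_row)))
df_filled
```

Keep in mind that a row consisting only of NAs has no mean, so its entries would come out as NaN rather than a number.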