
subtract mean from every element dplyr

Tags: r, scale, dplyr

I want to demean all my columns using dplyr. I tried using do() but could not get it to work.

I basically want to replicate the following using easier dplyr commands:

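library(dplyr)

tickers <- c(rep(1, 10), rep(2, 10))
df <- data.frame(tickers = tickers, col1 = 1:20, col2 = 2:21)
df %>% group_by(tickers)  # result is not assigned, so it has no effect on the next line
# subtract each column's overall mean from every element of that column
apply(df[, 2:3], 2, function(x) x - mean(x))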

I am sure this can be done much better using dplyr.

Thanks!

Nick asked Sep 21 '15 09:09




1 Answer

With dplyr, we can do this with mutate_each, using any of the helpers listed in ?select to pick the columns. Here I use matches, which takes a regular expression as its pattern.

library(dplyr)
df %>%
    mutate_each(funs(.-mean(.)), matches('^col')) %>%
    select(-tickers)
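
In later dplyr releases mutate_each and funs are deprecated, so the same idea may need to be written with across. The following is only a sketch, assuming dplyr 1.0.0 or newer and the example data from the question; the object names demeaned and demeaned_by_group are made up for illustration. The second pipeline assumes that the group_by(tickers) in the question means the means should be taken within each ticker, which the question does not state explicitly.

```r
library(dplyr)

# Example data from the question
tickers <- c(rep(1, 10), rep(2, 10))
df <- data.frame(tickers = tickers, col1 = 1:20, col2 = 2:21)

# Subtract the overall column mean from every element of col1 and col2
demeaned <- df %>%
  mutate(across(matches("^col"), ~ .x - mean(.x))) %>%
  select(-tickers)

# Assumption: demean within each ticker instead of over the whole column
demeaned_by_group <- df %>%
  group_by(tickers) %>%
  mutate(across(matches("^col"), ~ .x - mean(.x))) %>%
  ungroup()
```

After this, every column of demeaned has mean zero, and in demeaned_by_group the columns col1 and col2 have mean zero within each value of tickers.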

This can also be done in base R:

df[2:3]-colMeans(df[2:3])[col(df[2:3])]

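colMeans returns a vector with one mean per column. Indexing it with col(df[2:3]) repeats each mean once per row, so the replicated vector has the same length as the data and the subtraction lines up element by element.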
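
The manual replication can probably be avoided with sweep or scale, both of which are in base R. A minimal sketch, again assuming the example data from the question; the names centered and centered_m are made up for illustration.

```r
# Example data from the question
tickers <- c(rep(1, 10), rep(2, 10))
df <- data.frame(tickers = tickers, col1 = 1:20, col2 = 2:21)

# sweep() subtracts the matching column mean from each column (MARGIN = 2);
# the default FUN is "-"
centered <- sweep(df[2:3], 2, colMeans(df[2:3]))

# scale() with scale = FALSE only centers; it returns a matrix, not a data frame
centered_m <- scale(df[2:3], center = TRUE, scale = FALSE)
```

Both give the same numbers as the colMeans expression above; sweep keeps the data frame, while scale returns a matrix that also carries the column means in its "scaled:center" attribute.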

akrun answered Oct 15 '22 04:10