I want to demean all my columns using dplyr. I tried the do() function but could not get it to work.
Essentially, I want to replicate the following with simpler dplyr commands:
tickers <- c(rep(1,10),rep(2,10))
df <- data.frame(cbind(tickers,rep(1:20),rep(2:21)))
colnames(df) <- c("tickers","col1","col2")
df %>% group_by(tickers)
apply(df[,2:3],2,function(x) x - mean(x))
I am sure this can be done much better using dplyr.
Thanks!
If we are using dplyr, we can do this with mutate_each and use any of the methods mentioned in ?select to match the columns. Here, I am using matches, which can take a regular expression as the pattern.
library(dplyr)
df %>%
mutate_each(funs(.-mean(.)), matches('^col')) %>%
select(-tickers)
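In more recent dplyr releases, mutate_each and funs are deprecated in favour of across (available from dplyr 1.0.0). A minimal sketch of the same idea, assuming dplyr >= 1.0.0 is installed; the object names demeaned and demeaned_by_ticker are only illustrative, and the grouped variant is an assumption about what the group_by in the question was meant to achieve:

```r
library(dplyr)

# Same toy data as in the question
df <- data.frame(tickers = rep(1:2, each = 10), col1 = 1:20, col2 = 2:21)

# Subtract the overall column mean from every column whose name starts with "col"
demeaned <- df %>%
  mutate(across(matches("^col"), ~ .x - mean(.x))) %>%
  select(-tickers)

# Assumed variant: subtract the per-ticker mean instead of the overall mean
demeaned_by_ticker <- df %>%
  group_by(tickers) %>%
  mutate(across(matches("^col"), ~ .x - mean(.x))) %>%
  ungroup()
```

The first pipeline mirrors the mutate_each call above; the second only differs by the group_by step, so each mean is computed within a ticker.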
But this can also be done using base R:
df[2:3]-colMeans(df[2:3])[col(df[2:3])]
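The colMeans output is a vector with one mean per column. Indexing it with col(df[2:3]) repeats each mean once per row, so the result has the same number of elements as df[2:3] and lines up with it column by column.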
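To make the replication step visible, here is a small sketch that splits the one-liner into its parts. The intermediate names (x, idx, means, expanded, result) are only for illustration, and the sweep call at the end is an alternative I am adding, not part of the one-liner above:

```r
# Same toy data as in the question
df <- data.frame(tickers = rep(1:2, each = 10), col1 = 1:20, col2 = 2:21)
x <- df[2:3]

idx <- col(x)            # 20 x 2 integer matrix: first column all 1s, second all 2s
means <- colMeans(x)     # named vector with one mean per column
expanded <- means[idx]   # each mean repeated once per row, in column order

# Element-wise subtraction; the vector is matched to the data frame column by column
result <- x - expanded

# Alternative with sweep(), which subtracts a per-column statistic directly
result_sweep <- sweep(as.matrix(x), 2, means)
```

Subtracting colMeans(x) directly, without the indexing, would recycle the two means down the rows rather than across the columns, which is why the replication is needed.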