I have a dplyr question: how do I use transmute over each column without writing each column out by hand? That is, is there something like transmute_each()?
I want to do the following: using dplyr, I want to get the z-score of each column for the MWE below:
tickers <- c(rep(1,10),rep(2,10))
df <- data.frame(cbind(tickers,rep(1:20),rep(2:21),rep(2:21),rep(4:23),rep(3:22)))
colnames(df) <- c("tickers","col1","col2","col3","col4","col5")
df %>% group_by(tickers)
Is there a simple way to then use transmute to achieve the following:
for (i in 2:ncol(df)) {
  # parenthesise the numerator so the mean is subtracted before dividing by the sd
  df[, i] <- (df[, i] - mean(df[, i])) / sd(df[, i])
}
Many thanks
The group_by() function from the dplyr package groups a data frame by one or more columns, so that later operations such as summarise() are applied within each group.
%>% is the forward pipe operator in R. It forwards a value, or the result of an expression, into the next function call, which lets you chain commands. It is defined by the magrittr package (CRAN) and is heavily used by dplyr (CRAN).
Now that there is a transmute_at() function (as of dplyr 0.7), you can do the following:
df %>%
  group_by(tickers) %>%
  transmute_at(.vars = vars(starts_with("col")),
               .funs = funs(scale(.))) %>%
  ungroup()
Note that this uses the scale() function from base R, which by default centers a numeric vector on its mean and divides by its standard deviation, i.e. converts it into z-scores.
Also, the use of vars() in the .vars argument allows you to use all the helper functions that are available for dplyr's select(), such as one_of(), ends_with(), etc.
Finally, instead of writing funs(scale(.)) here, since you're using a simple function in the .funs argument, you can just write .funs = scale.
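As a quick sanity check of the claim about scale(), the sketch below compares it with the manual z-score formula on a small made-up vector. The vector x and the variable names are illustrative only and are not part of the question's data.

```r
# Illustrative vector (hypothetical data, not from the question)
x <- c(2, 4, 4, 4, 5, 5, 7, 9)

# Manual z-score: subtract the mean, then divide by the sample standard deviation
manual <- (x - mean(x)) / sd(x)

# scale() returns a one-column matrix with attributes; as.numeric() drops them
scaled <- as.numeric(scale(x))

# The two results should agree up to floating-point tolerance
same <- isTRUE(all.equal(manual, scaled))
print(same)
```

Because scale() returns a matrix rather than a plain vector, wrapping it in as.numeric() can be useful if you want ordinary numeric columns in the result.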
I solved this using the following:
df %>%
  group_by(tickers) %>%
  # the column-selection argument is named .vars (it was .cols before dplyr 0.7)
  mutate_at(.funs = funs((. - mean(.))/sd(.)),
            .vars = vars(matches("col")))
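For readers on a newer dplyr, here is a minimal sketch of the same per-group z-score using across(), assuming dplyr 1.0.0 or later is installed. This is a swapped-in alternative to the *_at() verbs used in the answers above, not what the original answers wrote. The helper name zscore and the object name result are hypothetical, and the data frame is rebuilt from the question's MWE.

```r
library(dplyr)

# Rebuild the question's example data
tickers <- c(rep(1, 10), rep(2, 10))
df <- data.frame(tickers,
                 col1 = 1:20, col2 = 2:21, col3 = 2:21,
                 col4 = 4:23, col5 = 3:22)

# Hypothetical helper: z-score of a numeric vector
zscore <- function(x) (x - mean(x)) / sd(x)

# Apply the helper to every column whose name starts with "col", within each ticker group
result <- df %>%
  group_by(tickers) %>%
  mutate(across(starts_with("col"), zscore)) %>%
  ungroup()
```

Since every non-grouping column is overwritten, mutate() gives the same columns here as a grouped transmute would. Each col* column in result should then have mean 0 and standard deviation 1 within each ticker.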