I have a data frame containing several variables that were measured at different time points (e.g., test1_tp1, test1_tp2, test1_tp3, test2_tp1, test2_tp2, ...).
I am now trying to use dplyr to add a new column to the data frame that holds the row-wise mean over a selection of these columns (e.g., the mean over all time points for test1).
data %>% ... %>% mutate(test1_mean = mean(test1_tp1, test1_tp2, test1_tp3, na.rm = TRUE))
data %>% ... %>% mutate(test1_mean = mean(matches("test1_.*"), na.rm = TRUE))
Syntax: mutate(new-col-name = rowSums(.)) The rowSums() function calculates the sum of each row, and mutate() appends that value to each row under the specified new column name. The argument . refers to the data frame passed in by the pipe, so the sum is taken over all of its columns, which must therefore all be numeric.
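As a minimal sketch of the rowSums(.) pattern described above, assuming a small made-up data frame (the name scores and its columns are hypothetical, not from the question):

```r
library(dplyr)

# Hypothetical example data; every column is numeric so rowSums() accepts it.
scores <- data.frame(a = c(1, 2, 3), b = c(10, 20, 30))

# Inside mutate(), `.` is the data frame supplied by %>%,
# so rowSums(.) adds up all columns of each row.
scores_with_total <- scores %>%
  mutate(total = rowSums(.))
```

If the data frame also holds non-numeric columns, they would have to be dropped first (for example with select()) before calling rowSums().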
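To find the mean of multiple columns for each combination of several grouping columns in an R data frame, we can use the summarise_at function together with the mean function. Note that this gives one mean per group and column, not one mean per row.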
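The grouped variant could look roughly like this. The data frame df and the grouping columns group and sex are invented for illustration; only the test1_tp* names follow the question.

```r
library(dplyr)

# Hypothetical data with two grouping columns and two measurement columns.
df <- data.frame(
  group = c("a", "a", "b", "b"),
  sex = c("f", "f", "m", "m"),
  test1_tp1 = c(1, 3, 5, 7),
  test1_tp2 = c(2, 4, 6, 8)
)

# One row per (group, sex) combination, with the mean of each selected column.
group_means <- df %>%
  group_by(group, sex) %>%
  summarise_at(vars(test1_tp1, test1_tp2), mean, na.rm = TRUE)
```

In dplyr 1.0.0 and later the _at variants are superseded by across(), so summarise(across(c(test1_tp1, test1_tp2), ~ mean(.x, na.rm = TRUE))) should be the more current spelling of the same idea.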
%>% is the forward pipe operator in R. It chains commands by forwarding a value, or the result of an expression, into the next function call or expression. It is defined by the magrittr package (CRAN) and is heavily used by dplyr (CRAN).
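A small illustration of what the pipe does, using a made-up vector x: the nested call and the piped chain below are meant to compute the same value.

```r
library(magrittr)

x <- c(1, 4, 9)

# Nested form: read from the inside out.
nested <- sum(sqrt(x))

# Piped form: x is forwarded into sqrt(), and that result into sum().
piped <- x %>% sqrt() %>% sum()
```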
rowwise() allows you to compute on a data frame one row at a time. This is most useful when a vectorised function doesn't exist. Most dplyr verbs preserve row-wise grouping. The exception is summarise(), which returns a grouped_df.
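Applied to the question, a rowwise() approach might be sketched as follows, assuming dplyr 1.0.0 or later for c_across(). The sample values and the column name mean_test1 are invented for illustration.

```r
library(dplyr)

# Hypothetical measurements with some missing values.
df <- data.frame(
  test1_tp1 = c(1, 4, NA),
  test1_tp2 = c(2, 5, 8),
  test1_tp3 = c(3, NA, 10)
)

row_means <- df %>%
  rowwise() %>%
  # c_across() collects the selected columns of the current row into a vector.
  mutate(mean_test1 = mean(c_across(starts_with("test1_")), na.rm = TRUE)) %>%
  # Drop the row-wise grouping so later verbs behave as usual.
  ungroup()
```

Because rowwise() evaluates the expression once per row, it tends to be slower than a vectorised function such as rowMeans() on large data frames.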
You can use starts_with inside select to find all columns whose names start with a certain string.
data %>%
  mutate(test1 = select(., starts_with("test1_")) %>%
           rowMeans(na.rm = TRUE))
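To try this out end to end, here is a self-contained sketch with invented sample data (the id column and all values are hypothetical). The second new column, alt_mean, is not part of the answer above; it shows an alternative that should give the same numbers using across() from dplyr 1.0.0 or later.

```r
library(dplyr)

# Hypothetical data: three test1 time points plus an unrelated test2 column.
data <- data.frame(
  id = 1:3,
  test1_tp1 = c(1, 4, NA),
  test1_tp2 = c(2, 5, 8),
  test1_tp3 = c(3, NA, 10),
  test2_tp1 = c(100, 200, 300)
)

result <- data %>%
  mutate(
    # Approach from the answer: select the columns, then take row means.
    test1 = select(., starts_with("test1_")) %>% rowMeans(na.rm = TRUE),
    # Alternative (an assumption, not from the answer): across() returns
    # the selected columns as a data frame that rowMeans() can consume.
    alt_mean = rowMeans(across(starts_with("test1_")), na.rm = TRUE)
  )
```

Only the columns whose names start with "test1_" enter the mean, so test2_tp1 and id are left out, and rows with missing values are averaged over the time points that are present.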