
R: sum of rows for different groups of columns that start with the same string

Tags: r, rowsum

I'm quite new to R and this is the first time I dare to ask a question here.

I'm working with a dataset of Likert-scale items and I want to compute row sums over different groups of columns, where each group shares the same prefix in its column names.

Below I constructed a data frame with only 2 rows to illustrate the approach I followed. I would like feedback on how to do this more efficiently.

df <- as.data.frame(rbind(rep(sample(1:5),4),rep(sample(1:5),4)))

var.names <- c("emp_1","emp_2","emp_3","emp_4","sat_1","sat_2"
           ,"sat_3","res_1","res_2","res_3","res_4","com_1",
           "com_2","com_3","com_4","com_5","cap_1","cap_2",
           "cap_3","cap_4")

names(df) <- var.names

What I did was use the grep function to select the columns whose names start with a given string, sum them row-wise, and store the result in a new variable. But this means writing a new line of code for each group of variables.

df$emp_t <- rowSums(df[, grep("\\bemp.", names(df))])
df$sat_t <- rowSums(df[, grep("\\bsat.", names(df))])
df$res_t <- rowSums(df[, grep("\\bres.", names(df))])
df$com_t <- rowSums(df[, grep("\\bcom.", names(df))])
df$cap_t <- rowSums(df[, grep("\\bcap.", names(df))])

But there are many more variables in the dataset, and I would like to know if there is a way to do this with a single line of code. For example, is there some way to group the variables that start with the same string and then apply the row-sum function to each group?

Thanks in advance!

Asked by csmontt on May 21 '15 at 20:05


1 Answer

One possible solution is to transpose df, sum the rows that share a prefix with the base R rowsum function, transpose the result back, and bind it to df. The output below was generated with set.seed(123).

cbind(df, t(rowsum(t(df), sub("_.*", "_t", names(df)))))
#   emp_1 emp_2 emp_3 emp_4 sat_1 sat_2 sat_3 res_1 res_2 res_3 res_4 com_1 com_2 com_3 com_4 com_5 cap_1 cap_2 cap_3 cap_4 cap_t
# 1     2     4     5     3     1     2     4     5     3     1     2     4     5     3     1     2     4     5     3     1    13
# 2     1     3     4     2     5     1     3     4     2     5     1     3     4     2     5     1     3     4     2     5    14
#   com_t emp_t res_t sat_t
# 1    15    14    11     7
# 2    15    10    12     9
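
To make the one-liner easier to follow, here is a minimal sketch that breaks it into steps. It uses a small data frame with hard-coded, made-up values (not the data from the question) so the result does not depend on the random seed. The intermediate names groups, sums_by_group, totals, result and totals_alt are only illustrative. The last line shows an alternative that I believe is equivalent, based on split.default and rowSums.

```r
# Small fixed data frame (hypothetical values, not the question's data)
df <- data.frame(emp_1 = c(2, 1), emp_2 = c(4, 3),
                 sat_1 = c(1, 5), sat_2 = c(2, 1), sat_3 = c(4, 3))

# Step 1: one group label per column, made by replacing the suffix with "_t"
groups <- sub("_.*", "_t", names(df))   # "emp_t" "emp_t" "sat_t" "sat_t" "sat_t"

# Step 2: t(df) turns columns into rows, so rowsum() can add up
# the rows that share a label (one output row per group)
sums_by_group <- rowsum(t(df), groups)

# Step 3: transpose back so that each group total becomes a column
totals <- t(sums_by_group)

# Step 4: attach the totals to the original data
result <- cbind(df, totals)
result$emp_t   # 6 4
result$sat_t   # 7 9

# Alternative sketch: split the columns by label, then rowSums() on each piece
totals_alt <- sapply(split.default(df, groups), rowSums)
```

Note that rowsum sorts the groups by label by default, which is why the total columns in the output above appear in alphabetical order (cap_t, com_t, emp_t, res_t, sat_t) rather than in the order of the original columns.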
Answered by David Arenburg on Sep 26 '22 at 09:09