How do I find columns with the same name, add their values, and replace them with their sum, using R?

Question

I have a data frame in which some consecutive columns have the same name. I need to find these columns, add their values row by row, drop one column, and replace the other with the sum. I don't know in advance which names are duplicated, so I may have to compare each column name with the next one to see if there's a match.

Can someone help?

Thanks in advance.

IRTFM · Accepted Answer

> dfrm <- data.frame(a = 1:10, b= 1:10, cc= 1:10, dd=1:10, ee=1:10)
> names(dfrm) <- c("a", "a", "b", "b", "b")
> sapply(unique(names(dfrm)[duplicated(names(dfrm))]), 
      function(x) Reduce("+", dfrm[ , grep(x, names(dfrm))]) )
       a  b
 [1,]  2  3
 [2,]  4  6
 [3,]  6  9
 [4,]  8 12
 [5,] 10 15
 [6,] 12 18
 [7,] 14 21
 [8,] 16 24
 [9,] 18 27
[10,] 20 30

EDIT 2: Using rowSums allows the first sapply argument to be simplified to just unique(names(dfrm)), at the expense of needing to remember to include drop=FALSE in "[":

sapply(unique(names(dfrm)), 
       function(x) rowSums( dfrm[ , grep(x, names(dfrm)), drop=FALSE]) )
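One caveat with the calls above: grep() matches by regular expression, so a name such as "a" would also select a column named "aa". A minimal sketch of a variant that compares names with == instead, and converts the matrix returned by sapply back to a data frame; the example data here is made up for illustration and is not from the question:

```r
# Hypothetical example data: "a" is repeated, and "aa" contains "a" as a substring
dfrm <- data.frame(1:3, 4:6, 7:9, 10:12)
names(dfrm) <- c("a", "a", "aa", "b")

# grep("a", names(dfrm)) would return 1 2 3 here, pulling in the "aa" column.
# Comparing with == selects only the columns whose name matches exactly.
sums <- sapply(unique(names(dfrm)),
               function(x) rowSums(dfrm[, names(dfrm) == x, drop = FALSE]))

# sapply() returns a matrix; convert it if a data frame is needed
result <- as.data.frame(sums)
result
```

The result has one column per distinct name, in order of first appearance, so columns that were not duplicated are carried over unchanged.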

To deal with NA's:

sapply(unique(names(dfrm)),
       function(x) apply(dfrm[grep(x, names(dfrm))], 1,
                         function(y) if (all(is.na(y))) NA else sum(y, na.rm=TRUE)))
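The reason for the explicit all(is.na(y)) check is that summing with na.rm=TRUE turns a row consisting only of NAs into 0 rather than NA. A small sketch of the difference, using made-up data and a hypothetical helper name (sum_or_na), with exact name matching in place of grep:

```r
# Hypothetical example data with missing values
dfrm <- data.frame(c(1, NA, NA), c(2, 5, NA), c(3, NA, NA))
names(dfrm) <- c("a", "a", "b")

# Helper (name is illustrative): NA if every value is missing, else the sum
sum_or_na <- function(y) if (all(is.na(y))) NA else sum(y, na.rm = TRUE)

res <- sapply(unique(names(dfrm)),
              function(x) apply(dfrm[names(dfrm) == x], 1, sum_or_na))

# For comparison: rowSums with na.rm = TRUE reports 0 for the all-NA row
plain <- rowSums(dfrm[names(dfrm) == "a"], na.rm = TRUE)
```

In res, the third row of column a stays NA, whereas plain reports 0 for that row.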

Edit note: addressed Tommy's counter-example by putting unique around the names(.)[.] construction. The erroneous code was:

sapply(names(dfrm)[unique(duplicated(names(dfrm)))], 
     function(x) Reduce("+", dfrm[ , grep(x, names(dfrm))]) )

Ramnath · Answer

Here is my one liner

# transpose data frame, sum by group = rowname, transpose back.
t(rowsum(t(dfrm), group = rownames(t(dfrm))))
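A slightly expanded sketch of the same idea, in case the original column order matters: rowsum sorts the groups by default, and passing reorder = FALSE keeps them in order of first appearance. The example data is made up, and the final conversion back to a data frame is an addition, not part of the one-liner above:

```r
# Hypothetical example data where the names are not in alphabetical order
dfrm <- data.frame(1:3, 4:6, 7:9, 10:12)
names(dfrm) <- c("b", "b", "a", "a")

m <- as.matrix(dfrm)

# Transpose so columns become rows, sum rows that share a name, transpose back.
# reorder = FALSE keeps groups in order of first appearance ("b" before "a").
summed <- t(rowsum(t(m), group = colnames(m), reorder = FALSE))

result <- as.data.frame(summed)
result
```

rowsum also accepts na.rm = TRUE if missing values should be skipped.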
