
Combining values in rows based on matching conditions in R

Tags: dataframe, r

I have a simple question about aggregating values in R.

Suppose I have a dataframe:

DF <- data.frame(col1=c("Type 1", "Type 1B", "Type 2"), col2=c(1, 2, 3))  

which looks like this:

     col1 col2
1  Type 1    1
2 Type 1B    2
3  Type 2    3

I notice that I have Type 1 and Type 1B in the data, so I would like to combine Type 1B into Type 1.

So I decide to use dplyr:

library(dplyr)

# Keep only the two rows to be merged, then add up their values
filter(DF, col1=='Type 1' | col1=='Type 1B') %>%
  summarise(n = sum(col2))

But now I need to keep going with it:

DF2 <- data.frame('Type 1', filter(DF, col1=='Type 1' | col1=='Type 1B') %>%
  summarise(n = sum(col2)))

I guess I want to rbind this new DF2 back onto the rest of the original DF, but that means I first have to make the column names consistent:

names(DF2) <- c('col1', 'col2')

OK, now I can rbind:

rbind(DF2, DF[3,])

The result? It worked....

    col1 col2
1 Type 1    3
3 Type 2    3

...but ugh! That was awful! There has to be a better way to simply combine values.

Monica Heddneck asked Apr 07 '15 20:04



1 Answer

Here's a possible dplyr approach:

library(dplyr)
DF %>%
  group_by(col1 = sub("(.*\\d+).*$", "\\1", col1)) %>%
  summarise(col2 = sum(col2))
#Source: local data frame [2 x 2]
#
#    col1 col2
#1 Type 1    3
#2 Type 2    3
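
To see what the regular expression is doing, it can help to run the `sub()` call on its own before grouping. The sketch below uses the same `DF` as the question; the names `key` and `res` are only illustrative, and the `aggregate()` call is a base R stand-in for the `group_by()` + `summarise()` step, not part of the approach above. It assumes every label ends in a number optionally followed by a suffix, as in the sample data.

```r
# Same data as in the question
DF <- data.frame(col1 = c("Type 1", "Type 1B", "Type 2"), col2 = c(1, 2, 3))

# Keep everything up to and including the last run of digits,
# dropping any trailing suffix, so "Type 1B" becomes "Type 1"
key <- sub("(.*\\d+).*$", "\\1", DF$col1)
key

# Base R alternative: replace col1 with the cleaned key, then sum col2 per key
res <- aggregate(col2 ~ col1, data = transform(DF, col1 = key), FUN = sum)
res
```

If the labels do not follow that pattern, an explicit recoding (for example replacing "Type 1B" with "Type 1" by hand) would probably be safer than a regex.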
talat answered Oct 19 '22 18:10
