I have a dataframe:
levels        counts
1, 2, 2       24
1, 2          20
1, 3, 3, 3    15
1, 3          10
1, 2, 3       25
I want to treat, for example, "1, 2, 2" and "1, 2" as the same level. As long as a row contains only "1" and "2", no matter how many times each is repeated, it should count as the level "1, 2". Here is the desired data frame:
levels     counts
1, 2       44
1, 3       25
1, 2, 3    25
Here is code to reproduce the original data frame:
df <- data.frame(levels = c("1, 2, 2", "1, 2", "1, 3, 3, 3", "1, 3", "1, 2, 3"),
                 counts = c(24, 20, 15, 10, 25))
df$levels <- as.character(df$levels)
Split df$levels on ", ", take the unique elements of each row, and sort them. Then paste them back together and use the result as the grouping key to aggregate counts.
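To see what each step does, here is a minimal sketch on a single value. The names x, parts, and key are only for illustration and are not part of the original data.

```r
# One raw level string with repeated elements
x <- "1, 3, 3, 3"

# Step 1: split on comma + space; strsplit returns a list, so take element 1
parts <- strsplit(x, ", ")[[1]]
parts
# [1] "1" "3" "3" "3"

# Step 2: drop repeats, sort, and join back into one canonical string
key <- paste(sort(unique(parts)), collapse = ", ")
key
# [1] "1, 3"
```

Note that the elements are sorted as character strings, so with multi-digit levels "10" would sort before "2". If that matters, converting with as.numeric before sorting is one option.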
df$levels2 = sapply(strsplit(df$levels, ", "), function(x)
    paste(sort(unique(x)), collapse = ", "))  # or: toString(sort(unique(x)))
aggregate(counts ~ levels2, df, sum)
#   levels2 counts
# 1    1, 2     44
# 2 1, 2, 3     25
# 3    1, 3     25
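aggregate returns the groups in sorted order, so "1, 2, 3" comes before "1, 3", while the desired data frame in the question lists "1, 3" first. If the row order matters, one possible variation (a sketch, not part of the original answer) is to turn the key into a factor whose levels follow first appearance. The names key and res are only for illustration.

```r
df <- data.frame(levels = c("1, 2, 2", "1, 2", "1, 3, 3, 3", "1, 3", "1, 2, 3"),
                 counts = c(24, 20, 15, 10, 25),
                 stringsAsFactors = FALSE)

# Canonical key per row: unique elements, sorted, re-joined with ", "
key <- sapply(strsplit(df$levels, ", "),
              function(x) toString(sort(unique(x))))

# Factor levels in order of first appearance control the group order
df$levels2 <- factor(key, levels = unique(key))

res <- aggregate(counts ~ levels2, df, sum)
res
#   levels2 counts
# 1    1, 2     44
# 2    1, 3     25
# 3 1, 2, 3     25
```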