Group variables in an R data frame using a specific list

I have the following lists:

  group1<-c("A", "B", "D")
  group2<-c("C", "E")
  group3<-c("F")

and a dataframe with values and corresponding names:

  df <- data.frame (name=c("A","B","C","D","E","F"),value=c(1,2,3,4,5,6))
  df
    name value
  1    A     1
  2    B     2
  3    C     3
  4    D     4
  5    E     5
  6    F     6

I'd like to assign each row to a group based on the lists, using the name column:

  df
    name value    group
  1    A     1   group1
  2    B     2   group1
  3    C     3   group2
  4    D     4   group1
  5    E     5   group2
  6    F     6   group3

and then sum the values for each group:

  df
       group sum
  1   group1   7
  2   group2   8
  3   group3   6

I've searched for similar posts, but couldn't adapt them to my problem.

user2904120 asked Oct 03 '22 08:10


1 Answer

Here's one approach. First, use nested ifelse calls to assign a group to each name (any name that is in neither group1 nor group2 falls through to "group3"), then use aggregate to get the sum for each group.

> df$group <- with(df, ifelse(name %in% group1, "group1",
                              ifelse(name %in% group2, "group2", "group3" )))
> aggregate(value ~ group, sum, data=df)
   group value
1 group1     7
2 group2     8
3 group3     6
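
The nested ifelse gets harder to read as the number of groups grows, and it labels every unmatched name "group3". As an alternative sketch (not part of the original answer), the groups can be collected into a named list and turned into a lookup table with stack(). The names lookup and sums are introduced here purely for illustration; the data are copied from the question.

```r
# Group definitions and data copied from the question.
group1 <- c("A", "B", "D")
group2 <- c("C", "E")
group3 <- c("F")
df <- data.frame(name  = c("A", "B", "C", "D", "E", "F"),
                 value = c(1, 2, 3, 4, 5, 6))

# stack() turns a named list into a two-column data frame:
# 'values' holds the names and 'ind' holds the group label.
lookup <- stack(list(group1 = group1, group2 = group2, group3 = group3))

# match() finds each name's row in the lookup table;
# a name that appears in no group would get NA here.
df$group <- as.character(lookup$ind[match(df$name, lookup$values)])

# Sum the values within each group.
sums <- aggregate(value ~ group, data = df, FUN = sum)
print(sums)
```

With this layout, adding a fourth group only means adding one more entry to the list. Rows whose name matches no group get NA and are dropped by aggregate's default NA handling, instead of being counted in the last group.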
Jilber Urbina answered Oct 09 '22 08:10