
Data wrangling to add columns that sum up counts of mapped values in R

Tags:

dataframe

r

I have a data frame of counts per person that looks like this:

Person_ID Apple Pear Chicken Steak Spinach
   1        1    0      0      5      1
   2        1    1      1      0      0
   3        0    0      0      3      2

I have another data frame that maps each food to its food group and looks like this:

Food     Group
Apple    Fruit
Pear     Fruit
Chicken  Meat
Steak    Meat
Spinach  Vegetable

I want to use the second data frame to add new columns to the first: one column per food group, each holding the sum of the counts in its constituent food columns. The final output should look like this:

Person_ID Apple Pear Chicken Steak Spinach Fruit Meat Vegetable
   1        1    0      0      5      1      1    5       1
   2        1    1      1      0      0      2    1       0
   3        0    0      0      3      2      0    3       2

I am having trouble doing this in a clean way, and it seems quite complicated. Is there a simple solution? I would appreciate advice on any solution at all.

user10156381 asked Dec 31 '22 13:12



1 Answer

This only needs an assignment. Select the columns of 'df1' named in the 'Food' column of 'df2', split those columns into a list by the 'Group' column, take the rowSums of each list element, and assign the result back to 'df1' as new columns named after the 'Group' values.

m1 <- sapply(split.default(df1[df2$Food], df2$Group), rowSums)
df1[colnames(m1)] <- m1

-output

df1
  Person_ID Apple Pear Chicken Steak Spinach Fruit Meat Vegetable
1         1     1    0       0     5       1     1    5         1
2         2     1    1       1     0       0     2    1         0
3         3     0    0       0     3       2     0    3         2
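
The one-liner above can be unpacked into its individual steps, which may make it easier to follow. This is only a sketch of the same approach; the intermediate names 'foods' and 'by_group' are illustrative and not part of the code above.

```r
# Example data, same values as in the 'data' section of this answer
df1 <- data.frame(Person_ID = 1:3,
                  Apple = c(1L, 1L, 0L), Pear = c(0L, 1L, 0L),
                  Chicken = c(0L, 1L, 0L), Steak = c(5L, 0L, 3L),
                  Spinach = c(1L, 0L, 2L))
df2 <- data.frame(Food = c("Apple", "Pear", "Chicken", "Steak", "Spinach"),
                  Group = c("Fruit", "Fruit", "Meat", "Meat", "Vegetable"))

# Step 1: keep only the food columns, in the order given by df2$Food
foods <- df1[df2$Food]

# Step 2: split the columns (not the rows) by group; the result is a
# named list of data frames, one per group
by_group <- split.default(foods, df2$Group)

# Step 3: row-wise totals for each group; sapply() binds them into a
# matrix with one column per group
m1 <- sapply(by_group, rowSums)

# Step 4: add the matrix columns to df1 under the group names
df1[colnames(m1)] <- m1
```

split.default() is used instead of split() because split() on a data frame splits rows, whereas here the columns are what need to be grouped.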

data

df1 <- structure(list(Person_ID = 1:3, Apple = c(1L, 1L, 0L), Pear = c(0L, 
1L, 0L), Chicken = c(0L, 1L, 0L), Steak = c(5L, 0L, 3L), Spinach = c(1L, 
0L, 2L)), class = "data.frame", row.names = c(NA, -3L))

df2 <- structure(list(Food = c("Apple", "Pear", "Chicken", "Steak", 
"Spinach"), Group = c("Fruit", "Fruit", "Meat", "Meat", "Vegetable"
)), class = "data.frame", row.names = c(NA, -5L))
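
One thing to watch for: 'df1[df2$Food]' stops with an "undefined columns selected" error if the mapping lists a food that is not a column of 'df1'. Below is a minimal sketch of one way to guard against that, assuming such mapping rows should simply be ignored. The 'Banana' row and the name 'map' are made up for illustration.

```r
df1 <- data.frame(Person_ID = 1:3,
                  Apple = c(1L, 1L, 0L), Pear = c(0L, 1L, 0L),
                  Chicken = c(0L, 1L, 0L), Steak = c(5L, 0L, 3L),
                  Spinach = c(1L, 0L, 2L))

# Hypothetical mapping with an extra food ("Banana") that df1 does not have
df2 <- data.frame(Food = c("Apple", "Pear", "Chicken", "Steak", "Spinach",
                           "Banana"),
                  Group = c("Fruit", "Fruit", "Meat", "Meat", "Vegetable",
                            "Fruit"))

# Keep only the mapping rows whose Food is actually a column of df1
map <- df2[df2$Food %in% names(df1), ]

# Same approach as above, using the filtered mapping
m1 <- sapply(split.default(df1[map$Food], map$Group), rowSums)
df1[colnames(m1)] <- m1
```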
akrun answered May 01 '23 02:05
