I have two data frames df1 and df2:
group=c("Group 1", "Group 2", "Group3","Group 1", "Group 2", "Group3")
year=c("2000","2000","2000", "2015", "2015", "2015")
items=c("12", "10", "15", "5", "10", "7")
df1=data.frame(group, year, items)
year=c("2000", "2015")
items=c("37", "22")
df2=data.frame(year,items)
df1 contains the number of items per year, broken down by group, and df2 contains the total number of items per year.
I'm trying to create a for loop that will calculate the proportion of items for each group type. I'm trying to do something like:
df1$Prop="" #create empty column called Prop in df1
for(i in 1:nrow(df1)){
df1$Prop[i]=df1$items/df2$items[df2$year==df1$year[i]]
}
The loop is supposed to get the proportion for each group (by taking the value from df1 and dividing it by the matching yearly total in df2) and store it in a new column, but this code isn't working.
You don't really need df2. Here's a simple solution using data.table and only df1 (I'm assuming items is a numeric column; if not, you'll need to convert it first with setDT(df1)[, items := as.numeric(as.character(items))]):
library(data.table)
setDT(df1)[, Prop := items/sum(items), by = year]
df1
# group year items Prop
# 1: Group 1 2000 12 0.3243243
# 2: Group 2 2000 10 0.2702703
# 3: Group3 2000 15 0.4054054
# 4: Group 1 2015 5 0.2272727
# 5: Group 2 2015 10 0.4545455
# 6: Group3 2015 7 0.3181818
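As a side note on the conversion mentioned at the top of this answer: going through as.character() matters when items is a factor (the default for strings in data.frame() before R 4.0.0), because as.numeric() on a factor returns the internal level codes rather than the values. A minimal sketch, with the factor built by hand purely for illustration:

```r
# Illustration only: force items to be a factor, as older R versions would.
items <- factor(c("12", "10", "15", "5", "10", "7"))

# Direct conversion returns the level codes, not the values.
codes <- as.numeric(items)

# Converting to character first recovers the actual numbers.
values <- as.numeric(as.character(items))

print(codes)
print(values)
```

If items is already a character vector (the default from R 4.0.0 onward), as.numeric(as.character(items)) still works, so the two-step form is safe in either case.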
Alternatively, if you already have df2, you can join the two and calculate Prop in the same step (again, I'm assuming items is numeric in the real data):
setkey(setDT(df1), year)[df2, Prop := items/i.items]
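The same join idea can be sketched in base R with merge(), in case data.table isn't available. This is only an illustration under the assumption that items is numeric in both data frames; the suffixes and the name merged are my own choices, not part of the original answer:

```r
# Sample data from the question, with items stored as numbers (assumption).
df1 <- data.frame(
  group = c("Group 1", "Group 2", "Group3", "Group 1", "Group 2", "Group3"),
  year  = c("2000", "2000", "2000", "2015", "2015", "2015"),
  items = c(12, 10, 15, 5, 10, 7),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  year  = c("2000", "2015"),
  items = c(37, 22),
  stringsAsFactors = FALSE
)

# Join on year; the suffixes separate the per-group count from the yearly total.
merged <- merge(df1, df2, by = "year", suffixes = c("_group", "_total"))

# Proportion of each group within its year.
merged$Prop <- merged$items_group / merged$items_total
print(merged)
```

Note that merge() may reorder rows, so match on group and year rather than on position when comparing with the other outputs.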
A base R alternative
with(df1, ave(items, year, FUN = function(x) x/sum(x)))
## [1] 0.3243243 0.2702703 0.4054054 0.2272727 0.4545455 0.3181818
dplyr equivalent to David's data.table solution
library(dplyr)
df1$items = as.integer(as.vector(df1$items))
df1 %>% group_by(year) %>% mutate(Prop = items / sum(items))
#Source: local data frame [6 x 4]
#Groups: year
# group year items Prop
#1 Group 1 2000 12 0.3243243
#2 Group 2 2000 10 0.2702703
#3 Group3 2000 15 0.4054054
#4 Group 1 2015 5 0.2272727
#5 Group 2 2015 10 0.4545455
#6 Group3 2015 7 0.3181818
plyr alternative
library(plyr)
ddply(df1, .(year), mutate, prop = items/sum(items))
lapply alternative
do.call(rbind, lapply(split(df1, df1$year),
function(x) { x$prop = x$items / sum(x$items); x }))
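For completeness, the loop from the question can also be repaired. As far as I can tell it fails for three reasons: items is stored as text, Prop is initialised as a character column, and df1$items is not indexed by i, so the whole column is divided on every pass. A minimal sketch of a fixed version, assuming the sample data from the question (Prop_vec is a name I made up for the vectorised variant):

```r
# Sample data as in the question (items stored as strings).
df1 <- data.frame(
  group = c("Group 1", "Group 2", "Group3", "Group 1", "Group 2", "Group3"),
  year  = c("2000", "2000", "2000", "2015", "2015", "2015"),
  items = c("12", "10", "15", "5", "10", "7"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  year  = c("2000", "2015"),
  items = c("37", "22"),
  stringsAsFactors = FALSE
)

# Convert the counts to numbers before doing any arithmetic.
df1$items <- as.numeric(as.character(df1$items))
df2$items <- as.numeric(as.character(df2$items))

# Start with a numeric column rather than "".
df1$Prop <- NA_real_
for (i in seq_len(nrow(df1))) {
  # Yearly total that matches the year of row i.
  total <- df2$items[df2$year == df1$year[i]]
  # Index items by i so a single value is divided, not the whole column.
  df1$Prop[i] <- df1$items[i] / total
}

# The same lookup without a loop, using match() to align the totals.
df1$Prop_vec <- df1$items / df2$items[match(df1$year, df2$year)]
print(df1)
```

The grouped solutions above are shorter and usually preferable; the loop is shown only to make clear what went wrong in the original attempt.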