
Create new column in data frame using a for loop to calculate value in R?




I have two data frames df1 and df2:

group=c("Group 1", "Group 2", "Group3","Group 1", "Group 2", "Group3")
year=c("2000","2000","2000", "2015", "2015", "2015")
items=c("12", "10", "15", "5", "10", "7")
df1=data.frame(group, year, items)

year=c("2000", "2015")
items=c("37", "22")
df2=data.frame(year, items)

df1 contains the number of items per year, broken down by group, and df2 contains the total number of items per year.

I'm trying to create a for loop that will calculate the proportion of items for each group type. I'm trying to do something like:

df1$Prop="" #create empty column called Prop in df1
for(i in 1:nrow(df1)){

The loop is supposed to compute the proportion for each row (the value from df1 divided by the matching yearly total in df2) and store it in the new column, but this code isn't working.

shrimp32 asked Jul 13 '15 22:07


2 Answers

You don't really need df2. Here's a simple solution using data.table and only df1. (I'm assuming items is a numeric column; if not, you'll need to convert it first with setDT(df1)[, items := as.numeric(as.character(items))].)

library(data.table)
setDT(df1)[, Prop := items/sum(items), by = year]
#      group year items      Prop
# 1: Group 1 2000    12 0.3243243
# 2: Group 2 2000    10 0.2702703
# 3:  Group3 2000    15 0.4054054
# 4: Group 1 2015     5 0.2272727
# 5: Group 2 2015    10 0.4545455
# 6:  Group3 2015     7 0.3181818

Alternatively, if you already have df2, you can join the two and calculate Prop in the same step (again, I'm assuming items is numeric in the real data):

setkey(setDT(df1), year)[df2, Prop := items/i.items]
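
The same join-then-divide idea can also be sketched without data.table, using base R's merge(). This is only an illustration: it rebuilds the question's data with items already numeric (an assumption about the real data), and the names total and merged are made up for the example.

```r
# Rebuild the question's data; items is entered as numeric here (assumption).
df1 <- data.frame(
  group = c("Group 1", "Group 2", "Group3", "Group 1", "Group 2", "Group3"),
  year  = c("2000", "2000", "2000", "2015", "2015", "2015"),
  items = c(12, 10, 15, 5, 10, 7),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  year  = c("2000", "2015"),
  items = c(37, 22),
  stringsAsFactors = FALSE
)

# Rename the totals column (hypothetical name "total") so it does not
# clash with df1$items after the merge.
names(df2)[names(df2) == "items"] <- "total"

# Left join on year, then divide each row's items by its yearly total.
merged <- merge(df1, df2, by = "year", all.x = TRUE)
merged$Prop <- merged$items / merged$total
merged
```

Note that merge() returns a new data frame sorted by year rather than modifying df1 in place, which is the main practical difference from the data.table version.
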

A base R alternative

with(df1, ave(items, year, FUN = function(x) x/sum(x)))
## [1] 0.3243243 0.2702703 0.4054054 0.2272727 0.4545455 0.3181818
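
If an explicit loop is still preferred, as in the question, a minimal sketch could look like the following. It assumes items is numeric in both data frames and uses match() to find the right total in df2; the variable total is introduced only for illustration. One likely problem with the original attempt is that df1$Prop="" creates a character column, so the sketch pre-allocates a numeric one instead.

```r
# Rebuild the question's data; items is entered as numeric here (assumption).
df1 <- data.frame(
  group = c("Group 1", "Group 2", "Group3", "Group 1", "Group 2", "Group3"),
  year  = c("2000", "2000", "2000", "2015", "2015", "2015"),
  items = c(12, 10, 15, 5, 10, 7),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  year  = c("2000", "2015"),
  items = c(37, 22),
  stringsAsFactors = FALSE
)

# Pre-allocate a numeric column; "" would make the column character.
df1$Prop <- NA_real_

for (i in seq_len(nrow(df1))) {
  # Look up the yearly total in df2 that matches this row's year.
  total <- df2$items[match(df1$year[i], df2$year)]
  df1$Prop[i] <- df1$items[i] / total
}
df1
```

The resulting Prop values should agree with the vectorised results above; the loop is shown only because the question asked for one, and the vectorised versions are usually shorter and faster.
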
David Arenburg answered Sep 20 '22 17:09

dplyr equivalent to David's data.table solution


library(dplyr)

# items must be numeric before dividing
df1$items = as.integer(as.vector(df1$items))
df1 %>% group_by(year) %>% mutate(Prop = items / sum(items))

#Source: local data frame [6 x 4]
#Groups: year

#    group year items      Prop
#1 Group 1 2000    12 0.3243243
#2 Group 2 2000    10 0.2702703
#3  Group3 2000    15 0.4054054
#4 Group 1 2015     5 0.2272727
#5 Group 2 2015    10 0.4545455
#6  Group3 2015     7 0.3181818

plyr alternative

library(plyr)
ddply(df1, .(year), mutate, prop = items/sum(items))

lapply alternative

do.call(rbind,lapply(split(df1, df1$year), 
        function(x){ x$prop = x$items / sum(x$items); x}))
Veerendra Gadekar answered Sep 20 '22 17:09