I am trying to obtain proportions within subsets of a data frame. For example, in this made-up data frame:
DF<-data.frame(category1=rep(c("A","B"),each=9),
category2=rep(rep(LETTERS[24:26],each=3),2),
animal=rep(c("dog","cat","mouse"),6),number=sample(18))
I would like to calculate the proportion of each of the three animals for each category1 by category2 combination (e.g., out of all animals that are both "A" and "X", what proportion are dogs?). With prop.table on column 4 of the data frame I can get the proportion that each row makes up of the total "number" column, but I have not found a way to do this for subsets based on category1 and category2. I also tried splitting the data by category1 and category2 using this:
splitDF<-split(DF,list(DF$category1,DF$category2))
I was hoping I could then apply a function with prop.table to get the proportions of each animal within each split group, but I cannot get prop.table working because I can't seem to specify which column to apply the function to within the split groups. Does anyone have any tips? Maybe this is possible with plyr or something similar? I can't find anything in the help forums about getting proportions within subsets of data.
You can use the ddply() function from the plyr package to calculate the proportions within each category1 by category2 combination and add them to the data frame as a new column.
library(plyr)
DF<-ddply(DF,.(category1,category2),transform,prop=number/sum(number))
DF
  category1 category2 animal number       prop
1         A         X    dog     17 0.44736842
2         A         X    cat      3 0.07894737
3         A         X  mouse     18 0.47368421
4         A         Y    dog      2 0.14285714
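If you would rather not load plyr, the same result should be reachable in base R. The sketch below is only one possible way to do it: ave() is swapped in for ddply(), and the last two lines show how the column can be named inside the function when going the split() route from the question. The set.seed() call and the name props are my own additions for illustration and are not part of the original code.

```r
# Made-up data with the same layout as in the question
set.seed(1)  # assumption: seed added only to make the example reproducible
DF <- data.frame(category1 = rep(c("A", "B"), each = 9),
                 category2 = rep(rep(LETTERS[24:26], each = 3), 2),
                 animal    = rep(c("dog", "cat", "mouse"), 6),
                 number    = sample(18))

# ave() returns the group sum aligned with every row, so the division
# gives each row's share of its category1 x category2 group
DF$prop <- DF$number / ave(DF$number, DF$category1, DF$category2, FUN = sum)

# The split() route: refer to the column by name inside the function
splitDF <- split(DF, list(DF$category1, DF$category2))
props <- lapply(splitDF, function(g) prop.table(g$number))  # hypothetical name
```

Here props is a list with one vector of proportions per group (named "A.X", "B.X", and so on), while DF$prop keeps the proportions next to the original rows, which is usually easier to work with afterwards.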