I have a minimal example of a data set D that looks something like:
score person freq
10 1 3
10 2 5
10 3 4
8 1 3
7 2 2
6 4 1
Now, I want to be able to plot frequency of score=10 against person.
However, if I do:
#My bad, turns out the next line only works for matrices anyway:
#D = D[which(D[,1] == 10)]
D = subset(D, score == 10)
then I get:
score person freq
10 1 3
10 2 5
10 3 4
However, this is what I would like to get:
score person freq
10 1 3
10 2 5
10 3 4
10 4 0
Is there any quick and painless way for me to do this in R?
Here's a base R approach:
subset(as.data.frame(xtabs(freq ~ score + person, D)), score == 10)
# score person Freq
#4 10 1 3
#8 10 2 5
#12 10 3 4
#16 10 4 0
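To make the above reproducible, here is a minimal sketch that rebuilds the example data by hand. The data.frame() call is an assumption about how D is stored; the question only shows its printed form. It also shows why the zero row appears: xtabs() cross-tabulates every score/person combination and puts 0 in the cells that were never observed. Note that as.data.frame() turns score and person into factors and names the count column Freq, so the sketch converts person back to a number at the end.

```r
# Assumed reconstruction of the example data from the question
D <- data.frame(
  score  = c(10, 10, 10, 8, 7, 6),
  person = c(1, 2, 3, 1, 2, 4),
  freq   = c(3, 5, 4, 3, 2, 1)
)

# Sum freq over every score x person cell; unobserved cells become 0
tab <- xtabs(freq ~ score + person, data = D)

# Flatten the table to long format, then keep the score == 10 rows
res <- subset(as.data.frame(tab), score == 10)

# person is a factor at this point; convert it back if numbers are needed
res$person <- as.integer(as.character(res$person))
res
```

The factor-to-number conversion matters mostly for plotting, where a factor and a numeric axis behave differently.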
You can use complete() from the tidyr package to create the missing rows, and then simply subset:
library(tidyr)
D2 <- complete(D, score, person, fill = list(freq = 0))
D2[D2$score == 10, ]
## Source: local data frame [4 x 3]
##
## score person freq
## (int) (int) (dbl)
## 1 10 1 3
## 2 10 2 5
## 3 10 3 4
## 4 10 4 0
complete() takes the data frame as its first argument, followed by the names of the columns whose combinations should be completed. The fill argument is a list that gives, for each of the remaining columns (only freq here), the value to use in the newly created rows.
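To see what fill actually changes, the following sketch runs complete() with and without it. It assumes D is a plain data frame built as shown (the construction is not part of the original question) and that tidyr is installed. Without fill, the new rows get NA in freq; with fill, they get 0.

```r
library(tidyr)

# Assumed reconstruction of the example data from the question
D <- data.frame(
  score  = c(10, 10, 10, 8, 7, 6),
  person = c(1, 2, 3, 1, 2, 4),
  freq   = c(3, 5, 4, 3, 2, 1)
)

# Without fill: every score x person pair gets a row, missing freq is NA
D_na <- complete(D, score, person)

# With fill: the same rows, but missing freq is replaced by 0
D2 <- complete(D, score, person, fill = list(freq = 0))

# 4 distinct scores x 4 distinct persons
nrow(D2)

# The row that did not exist in D
D_na[D_na$score == 10 & D_na$person == 4, ]
D2[D2$score == 10 & D2$person == 4, ]
```

Since complete() expands over all listed columns, the result contains rows for every score, not only score 10, which is why the subsetting step is still needed afterwards.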
As suggested by docendo-discimus, this can be simplified further by also using the dplyr package:
library(tidyr)
library(dplyr)
complete(D, score, person, fill = list(freq = 0)) %>% filter(score == 10)