
Check whether steps in a counter variable are missing

Tags:

r

dplyr

tidyr

I have a data file with one row per participant. Participants are numbered from 1 to x within the study they took part in. I want to check whether all participants are present in the dataset. This is my toy dataset: personid identifies the participant, and study is the study they took part in.

df <- read.table(text = "personid study measurement
1         x     23
2         x     32
1         y     21
3         y     23
4         y     23
6         y     23", header=TRUE)

which looks like this:

  personid study measurement
1        1    x          23
2        2    x          32
3        1    y          21
4        3    y          23
5        4    y          23
6        6    y          23

So for y, participants 2 and 5 are missing. How do I check that automatically? I tried adding a counter variable and comparing it to the participant ID, but once one participant is missing the comparison is meaningless, because every later row is out of alignment.

library(dplyr)

df %>% group_by(study) %>% mutate(id = 1:n(), check = id == personid)
Source: local data frame [6 x 5]
Groups: study [2]

  personid   study measurement    id check
     <int> <fctr>       <int> <int> <lgl>
1        1      x          23     1  TRUE
2        2      x          32     2  TRUE
3        1      y          21     1  TRUE
4        3      y          23     2 FALSE
5        4      y          23     3 FALSE
6        6      y          23     4 FALSE
Esther asked Jan 05 '23 05:01


1 Answer

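Assuming personid is sequential within each study, you can do this with setdiff, comparing the full range of IDs against the ones actually present (the range here runs from max to min, so the missing IDs are listed in descending order), i.e.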

library(dplyr)

df %>% 
 group_by(study) %>% 
 mutate(new = toString(setdiff(max(personid):min(personid), personid)))

#Source: local data frame [6 x 4]
#Groups: study [2]

#  personid  study measurement   new
#     <int> <fctr>       <int> <chr>
#1        1      x          23      
#2        2      x          32      
#3        1      y          21  5, 2
#4        3      y          23  5, 2
#5        4      y          23  5, 2
#6        6      y          23  5, 2
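
If one entry per study is enough, the same idea can be sketched in base R. This is only a sketch under one assumption: IDs start at 1 in every study, as the question describes, so the range is built with seq_len(max(ids)) rather than min:max (which would not notice a missing participant 1). The name missing_by_study is just illustrative.

```r
# Toy data from the question
df <- read.table(text = "personid study measurement
1         x     23
2         x     32
1         y     21
3         y     23
4         y     23
6         y     23", header = TRUE)

# Split the IDs by study, then compare each study's IDs with 1..max(id).
# Assumption: numbering starts at 1 within every study.
missing_by_study <- lapply(
  split(df$personid, df$study),
  function(ids) setdiff(seq_len(max(ids)), ids)
)

print(missing_by_study)
```

The result is a named list with one integer vector per study; an empty vector means no gaps were found up to that study's highest ID. Participants missing beyond the highest observed ID cannot be detected this way unless the expected count per study is known.
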
Sotos answered Jan 06 '23 19:01