ifelse function group in group in R

Question

I have data set

ID <- c(1,1,2,2,2,2,3,3,3,3,3,4,4,4)
Eval <- c("A","A","B","B","A","A","A","A","B","B","A","A","A","B")
med <- c("c","d","k","k","h","h","c","d","h","h","h","c","h","k")
df <- data.frame(ID,Eval,med)
> df
    ID Eval med
 1   1    A   c
 2   1    A   d
 3   2    B   k
 4   2    B   k
 5   2    A   h
 6   2    A   h
 7   3    A   c
 8   3    A   d
 9   3    B   h
 10  3    B   h
 11  3    A   h
 12  4    A   c
 13  4    A   h
 14  4    B   k

I try to create variable x and y, group by ID and Eval. For each ID, if Eval = A, and med = "h" or "k", I set x = 1, other wise x = 0, if Eval = B and med = "h" or "k", I set y = 1, other wise y = 0. I use the way I don't like it, I got answer but it seem like not that great

df <- data.table(df)
setDT(df)[, count := uniqueN(med) , by = .(ID,Eval)]
setDT(df)[Eval == "A", x:= ifelse(count == 1 & med %in% c("k","h"),1,0), by=ID]
setDT(df)[Eval == "B", y:= ifelse(count == 1 & med %in% c("k","h"),1,0), by=ID]


     ID Eval med count  x  y
 1:  1    A   c     2  0 NA
 2:  1    A   d     2  0 NA
 3:  2    B   k     1 NA  1
 4:  2    B   k     1 NA  1
 5:  2    A   h     1  1 NA
 6:  2    A   h     1  1 NA
 7:  3    A   c     3  0 NA
 8:  3    A   d     3  0 NA
 9:  3    B   h     1 NA  1
10:  3    B   h     1 NA  1
11:  3    A   h     3  0 NA
12:  4    A   c     2  0 NA
13:  4    A   h     2  0 NA
14:  4    B   k     1 NA  1

Then I need to collapse the row to get unique ID, I don't know how to collapse rows, any idea?

The output

akrun · Accepted Answer

We create the 'x' and 'y' variables grouped by 'ID' without the NA elements directly coercing the logical vector to binary (as.integer)

df[, x := as.integer(Eval == "A" & count ==1 & med %in% c("h", "k")) , by = ID]

and similarly for 'y'

df[, y := as.integer(Eval == "B" & count ==1 & med %in% c("h", "k")) , by = ID]

and summarise it, using any after grouping by "ID"

df[, lapply(.SD, function(x) as.integer(any(x))) , ID, .SDcols = x:y]
#   ID x y
#1:  1 0 0
#2:  2 1 1
#3:  3 0 1
#4:  4 0 1

If we need a compact approach, instead of assinging (:=), we summarise the output grouped by "ID", "Eval" based on the conditions and then grouped by 'ID', we check if there is any TRUE values in 'x' and 'y' by looping over the columns described in the .SDcols.

setDT(df)[,  if(any(uniqueN(med)==1 & med %in% c("h", "k"))) {
        .(x= Eval=="A", y= Eval == "B") } else .(x=FALSE, y=FALSE),
     by = .(ID, Eval)][, lapply(.SD, any) , by = ID, .SDcols = x:y]
#  ID     x     y
#1:  1 FALSE FALSE
#2:  2  TRUE  TRUE
#3:  3 FALSE  TRUE
#4:  4 FALSE  TRUE

If needed, we can convert to binary similar to the approach showed in the first solution.

ifelse function group in group in R

Tags:

r

duplicates

if-statement

data.table

BIN

1 Answers

akrun

Recent Activity

Donate For Us

ifelse function group in group in R

Tags:

r

duplicates

if-statement

data.table

BIN

1 Answers

akrun

Related questions

Recent Activity

Donate For Us