Selecting n cells below matched column cell sequence [duplicate]

Tags:

r

I need to select the n cells below each match of a specific sequence in a column. Assume n = 2 and the sequence is A, B, C: after each occurrence of this sequence, I would like to pick up the two cells that follow it. My table is:

Table1 <- data.frame(
  ID = 1:31,
  Sequence = c("A", "B", "D", "E", "A", "B", "C", "f", "n", "p",
               "C", "D", "D", "E", "A", "B", "C", "z", "t", "g",
               "A", "C", "D", "A", "B", "C", "p", "l", "x", "v", "A")
)

I would like to get this table:

Table2 <- data.frame(
  ID = 1:31,
  Sequence = c("A", "B", "D", "E", "A", "B", "C", "f", "n", "p",
               "C", "D", "D", "E", "A", "B", "C", "z", "t", "g",
               "A", "C", "D", "A", "B", "C", "p", "l", "x", "v", "A"),
  Selected = c("", "", "", "", "A", "B", "C", "f", "n", "",
               "", "", "", "", "A", "B", "C", "z", "t", "",
               "", "", "", "A", "B", "C", "p", "l", "", "", "")
)

Can you please help me with this?

Farshid Owrang asked Dec 04 '22 17:12


2 Answers

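Another solution is to find the rows where the pattern starts, expand each match to the five rows it covers (the three pattern rows plus the n = 2 rows below), and merge that subset back onto the table to create your resulting column, i.e.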

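library(dplyr)  # provides lead()
# Rows where "A" is immediately followed by "B" and then "C"
idx <- which(Table1$Sequence == "A" & lead(Table1$Sequence) == "B" & lead(Table1$Sequence, n = 2) == "C")
# Expand each match to 5 rows: the 3 pattern rows plus the n = 2 rows below
idx <- c(sapply(idx, function(i) seq(i, i + 4)))
# Outer-join the selected rows back onto the full table by ID
merge(Table1, Table1[idx, ], by = 'ID', all = TRUE)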

   ID Sequence.x Sequence.y
1   1          A       <NA>
2   2          B       <NA>
3   3          D       <NA>
4   4          E       <NA>
5   5          A          A
6   6          B          B
7   7          C          C
8   8          f          f
9   9          n          n
10 10          p       <NA>
11 11          C       <NA>
12 12          D       <NA>
13 13          D       <NA>
14 14          E       <NA>
15 15          A          A
16 16          B          B
17 17          C          C
18 18          z          z
19 19          t          t
20 20          g       <NA>
21 21          A       <NA>
22 22          C       <NA>
23 23          D       <NA>
24 24          A          A
25 25          B          B
26 26          C          C
27 27          p          p
28 28          l          l
29 29          x       <NA>
30 30          v       <NA>
31 31          A       <NA>
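
The same idea can be generalized to any pattern and any n. The following is only a sketch in base R, not part of the original answer: `select_after_pattern` is a hypothetical helper name, and it assumes that empty strings (as in Table2) are wanted instead of NA for unselected rows.

```r
# Hypothetical helper (an assumption, not from the original answer): keep every
# run that matches `pattern` plus the `n` elements after it, "" elsewhere.
select_after_pattern <- function(x, pattern, n) {
  k <- length(pattern)
  len <- length(x)
  out <- rep("", len)
  if (len < k) return(out)
  # Candidate start positions where the whole pattern still fits
  starts <- seq_len(len - k + 1)
  # For each offset j, test x[start + j - 1] == pattern[j], then AND the tests
  hit <- Reduce(`&`, lapply(seq_len(k), function(j) x[starts + j - 1] == pattern[j]))
  starts <- which(hit)
  # Expand each match to the pattern rows plus n more, clipped at the end of x
  idx <- unique(unlist(lapply(starts, function(s) s:min(s + k + n - 1, len))))
  out[idx] <- x[idx]
  out
}

Sequence <- c("A", "B", "D", "E", "A", "B", "C", "f", "n", "p",
              "C", "D", "D", "E", "A", "B", "C", "z", "t", "g",
              "A", "C", "D", "A", "B", "C", "p", "l", "x", "v", "A")
Selected <- select_after_pattern(Sequence, c("A", "B", "C"), n = 2)
data.frame(ID = seq_along(Sequence), Sequence, Selected)
```

Unlike the hard-coded `seq(i, i + 4)`, this version clips the selection when a match sits closer than n rows to the end of the data, so no out-of-range indices are produced.
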
Sotos answered Dec 06 '22 05:12


Using base split and cumsum:

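x <- c("A", "B", "C")
do.call(rbind,
        # Start a new group at every occurrence of the pattern's first element
        lapply(split(Table1, cumsum(Table1$Sequence == x[1])),
               function(i) {
                 # s is 3 only if the group begins with A, B, C (NA if the group is shorter)
                 s <- sum(i$Sequence[1:3] == x)
                 if (is.na(s) || s < 3) {
                   cbind(i, res = NA)
                 } else {
                   # Keep the first 5 rows (pattern + 2 below) and set the rest to NA;
                   # replace() also copes with groups shorter than 5 rows
                   cbind(i, res = replace(i$Sequence, seq_len(nrow(i)) > 5, NA))
                 }
               }))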

Output:

#      ID Sequence  res
# 1.1   1        A <NA>
# 1.2   2        B <NA>
# 1.3   3        D <NA>
# 1.4   4        E <NA>
# 2.5   5        A    A
# 2.6   6        B    B
# 2.7   7        C    C
# 2.8   8        f    f
# 2.9   9        n    n
# 2.10 10        p <NA>
# 2.11 11        C <NA>
# 2.12 12        D <NA>
# 2.13 13        D <NA>
# 2.14 14        E <NA>
# 3.15 15        A    A
# 3.16 16        B    B
# 3.17 17        C    C
# 3.18 18        z    z
# 3.19 19        t    t
# 3.20 20        g <NA>
# 4.21 21        A <NA>
# 4.22 22        C <NA>
# 4.23 23        D <NA>
# 5.24 24        A    A
# 5.25 25        B    B
# 5.26 26        C    C
# 5.27 27        p    p
# 5.28 28        l    l
# 5.29 29        x <NA>
# 5.30 30        v <NA>
# 6    31        A <NA>
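
To see why the split works, it may help to look at the grouping on its own. The short vector `s` below is made up for illustration and is not from the question: `cumsum()` over the logical test increments by one at every "A", so each "A" opens a new group.

```r
# Illustrative data (an assumption, not the question's table)
s <- c("A", "B", "D", "A", "B", "C", "f", "A")

# TRUE counts as 1, so the running sum steps up at each "A"
grp <- cumsum(s == "A")
grp
# 1 1 1 2 2 2 2 3

# One chunk per group, each starting with "A"
split(s, grp)
```

Because a new group starts at every occurrence of the first pattern element, this approach presumably only suits patterns whose first element does not recur inside the pattern itself; something like A, A, B would be cut across two groups.
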
zx8754 answered Dec 06 '22 07:12