I have a question regarding data.table in R
i have a dataset like this
data <- data.table(a=c(1:7,12,32,13),b=c(1,5,6,7,8,3,2,5,1,4))
a b
1: 1 1
2: 2 5
3: 3 6
4: 4 7
5: 5 8
6: 6 3
7: 7 2
8: 12 5
9: 32 1
10: 13 4
Now i want to generate a third column c, which gonna compare the value of each row of a, to all previous values of b and check if there is any value of b is bigger than a. For e.g, at row 5, a=5, and previous value of b is 1,5,6,7. so 6 and 7 is bigger than 5, therefore value of c should be 1, otherwise it would be 0. The result should be like this
a b c
1: 1 1 NA
2: 2 5 0
3: 3 6 1
4: 4 7 1
5: 5 8 1
6: 6 3 1
7: 7 2 1
8: 12 5 0
9: 32 1 0
10: 13 4 0
I tried with a for loop but it takes a very long time. I also tried shift but i can not refer to multiple previous rows with shift. Anyone has any recommendation?
R – Get Specific Row of Matrix To get a specific row of a matrix, specify the row number followed by a comma, in square brackets, after the matrix variable name. This expression returns the required row as a vector.
The next record in R For the next record, you have to return the same column, but remove the first record and add NA at the end or other appropriate values. If you are dealing with subgroups in data like I do, then check if the next row contains the same category.
library(data.table)
data <- data.table(a=c(1:7,12,32,13),b=c(1,5,6,7,8,3,2,5,1,4))
data[,c:= a <= shift(cummax(b))]
This is a base R solution (see the dplyr
solution below):
data$c = NA
data$c[2:nrow(data)] <- sapply(2:nrow(data), function(x) { data$c[x] <- any(data$a[x] < data$b[1:(x-1)]) } )
## a b c
## 1: 1 1 NA
## 2: 2 5 0
## 3: 3 6 1
## 4: 4 7 1
## 5: 5 8 1
## 6: 6 3 1
## 7: 7 2 1
## 8: 12 5 0
## 9: 32 1 0
## 10: 13 4 0
EDIT
Here is a simpler solution using dplyr
library(dplyr)
### Given the cumulative max and comparing to 'a', set see to 1/0.
data %>% mutate(c = ifelse(a < lag(cummax(b)), 1, 0))
## a b c
## 1 1 1 NA
## 2 2 5 0
## 3 3 6 1
## 4 4 7 1
## 5 5 8 1
## 6 6 3 1
## 7 7 2 1
## 8 12 5 0
## 9 32 1 0
## 10 13 4 0
### Using 'shift' with dplyr
data %>% mutate(c = ifelse(a <= shift(cummax(b)), 1, 0))
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With