I have data on this form in a data.table DT:
DT = data.table(
year=c('1981', '1981', '1981', '2005', '2005', '2005'),
value=c(2, 8, 16, 3, 9, 27),
order =c(1,2,3,1,2,3))
| year | value | order |
|---|---|---|
| '1981' | 2 | 1 |
| '1981' | 8 | 2 |
| '1981' | 16 | 3 |
| '2005' | 3 | 1 |
| '2005' | 9 | 2 |
| '2005' | 27 | 3 |
And I want to create new columns based first on the order within a specific year, but then sequentially on the order if I shift it. As you can see value=16 which starts as order=3 on row 1, is logged as order = 2 on row 2, etc.
| year | order1 | order2 | order3 |
|---|---|---|---|
| '1981' | 2 | 8 | 16 |
| '1981' | 8 | 16 | NA |
| '1981' | 16 | NA | NA |
| '2005' | 3 | 9 | 27 |
| '2005' | 9 | 27 | NA |
| '2005' | 27 | NA | NA |
If I wanted it just by order, and get rows 1 and 4 as output, I could do:
dcast(DT, year ~ order, value.var = c('value'))
But how can I cast based on order while incorporating this reordering?
I could perhaps create new columns indicating the new shifted order, using:
DT[,order_2:= c(NA,1,2,NA,1,2)]
DT[,order_3:= c(NA,NA,1,NA,NA,1)]
But then how do I do casting on all three columns? Is there a more elegant way than just casting 3 times and then joining the results?
You don't necessarily need dcast, try this:
DT[, lapply(seq_along(value), \(v) {l <- length(value); `length<-`(value[v:l], l)}), by=year]
# year V1 V2 V3
# 1: 1981 2 8 16
# 2: 1981 8 16 NA
# 3: 1981 16 NA NA
# 4: 2005 3 9 27
# 5: 2005 9 27 NA
# 6: 2005 27 NA NA
using shift(value, 0:-2) makes this a one liner:
setnames(DT[, shift(value,(-0:-2)),year], c("year", paste0("order",1:3)))[]
Output:
year order1 order2 order3
1: 1981 2 8 16
2: 1981 8 16 NA
3: 1981 16 NA NA
4: 2005 3 9 27
5: 2005 9 27 NA
6: 2005 27 NA NA
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With