I have a data table and I want to create a new column for each unique day, and then assign a 1 in each row where the day matches the column name
I have done this using a for loop but I was wondering if there was any way to optimise it using data.table and set?
Here is an example
dt <- data.table(Week_Day = c("Monday", "Tuesday", "Wednesday",
"Thursday", "Friday", "Saturday", "Sunday"))
Day <- unique(dt$Week_Day)
for (i in 1:length(Day)) {
if (Day[i] != "Sunday") {
dt[, Day[i] := ifelse(Week_Day == Day[i], 1, 0)]
}
}
my table is 298k rows and although it doesn't take long to execute (below), its part of a long script and I have quite a few inefficient loops so I am trying to get the overall run time down.
Time to run:
user system elapsed
0.99 0.06 1.05
Thanks in advance.
Here's a different approach that, performs better - on my machine - than the original approach in the question
1) Get unique days except Sunday
Day <- setdiff(dt$Week_Day, "Sunday")
2) Initialize new columns with 0:
dt[, (Day) := 0L]
3) Update with 1s by reference in a loop:
for(x in Day) {
set(dt, i = which(dt[["Week_Day"]] == x), j = x, value = 1L)
}
Simple performance comparison:
dt1 <- data.table(Week_Day = sample(c("Monday", "Tuesday", "Wednesday",
"Thursday", "Friday", "Saturday", "Sunday"), 3e5, TRUE))
dt2 <- copy(dt1)
system.time({
Day <- setdiff(unique(dt$Week_Day), "Sunday")
dt1[, (Day) := 0L]
for(x in Day) {
set(dt1, i = which(dt1[["Week_Day"]] == x), j = x, value = 1L)
}
})
# User System verstrichen
# 0.029 0.003 0.032
system.time({
Day <- unique(dt$Week_Day)
for (i in 1:length(Day)) {
if (Day[i] != "Sunday") {
dt2[, Day[i] := ifelse(Week_Day == Day[i], 1L, 0L)]
}
}
})
# User System verstrichen
# 0.138 0.070 0.210
all.equal(dt1, dt2)
#[1] TRUE
Here is one attempt at a speed up:
Day <- unique(dt$Week_Day)
setkey(dt, Week_Day)
# create columns of 0s
dt[, (Day) := 0L]
for (i in seq_along(head(Day, -1))) {
dt[Day[i], Day[i] := 1L]
}
This implements a couple of the data.table
speed ups including binary search in the second chain and the elimination of ifelse
with replacement by reference.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With