I want to conditionally replace missing revenue up to 16th July 2017 with zero using tidyverse.
My Data
library(tidyverse)
library(lubridate)
df<- tribble(
~Date, ~Revenue,
"2017-07-01", 500,
"2017-07-02", 501,
"2017-07-03", 502,
"2017-07-04", 503,
"2017-07-05", 504,
"2017-07-06", 505,
"2017-07-07", 506,
"2017-07-08", 507,
"2017-07-09", 508,
"2017-07-10", 509,
"2017-07-11", 510,
"2017-07-12", NA,
"2017-07-13", NA,
"2017-07-14", NA,
"2017-07-15", NA,
"2017-07-16", NA,
"2017-07-17", NA,
"2017-07-18", NA,
"2017-07-19", NA,
"2017-07-20", NA
)
df$Date <- ymd(df$Date)
Date up to which I want to conditionally replace NAs
max.date <- ymd("2017-07-16")
Output I desire
# A tibble: 20 × 2
Date Revenue
<chr> <dbl>
1 2017-07-01 500
2 2017-07-02 501
3 2017-07-03 502
4 2017-07-04 503
5 2017-07-05 504
6 2017-07-06 505
7 2017-07-07 506
8 2017-07-08 507
9 2017-07-09 508
10 2017-07-10 509
11 2017-07-11 510
12 2017-07-12 0
13 2017-07-13 0
14 2017-07-14 0
15 2017-07-15 0
16 2017-07-16 0
17 2017-07-17 NA
18 2017-07-18 NA
19 2017-07-19 NA
20 2017-07-20 NA
The only way I could work this out was to split the df into several parts, update for NAs
and then rbind
the whole lot.
Could someone please help me do this efficiently using tidyverse.
We can mutate
the 'Revenue' column to replace
the NA
with 0 using a logical condition that checks whether the element is NA and the 'Date' is less than or equal to 'max.date'
df %>%
mutate(Revenue = replace(Revenue, is.na(Revenue) & Date <= max.date, 0))
# A tibble: 20 x 2
# Date Revenue
# <date> <dbl>
# 1 2017-07-01 500
# 2 2017-07-02 501
# 3 2017-07-03 502
# 4 2017-07-04 503
# 5 2017-07-05 504
# 6 2017-07-06 505
# 7 2017-07-07 506
# 8 2017-07-08 507
# 9 2017-07-09 508
#10 2017-07-10 509
#11 2017-07-11 510
#12 2017-07-12 0
#13 2017-07-13 0
#14 2017-07-14 0
#15 2017-07-15 0
#16 2017-07-16 0
#17 2017-07-17 NA
#18 2017-07-18 NA
#19 2017-07-19 NA
#20 2017-07-20 NA
It can be achieved with data.table
by specifying the logical condition in 'i and assigning (:=
) the 'Revenue' to 0
library(data.table)
setDT(df)[is.na(Revenue) & Date <= max.date, Revenue := 0]
Or with base R
df$Revenue[is.na(df$Revenue) & df$Date <= max.date] <- 0
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With