Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Obtaining the number of Days it takes to reach the desired amount

Tags:

r

dplyr

lubridate

I currently have a data frame which contains Name of Station, Date of Rainfall, Amount of Rainfall (Example attached) I am interested in exploring the number of days (and/or months) it takes each Station to reach a particular amount of rainfall. For example:

Is it possible to obtain an output like the one above based on a dataset like the example one? My initial thought process is to filter each station individually, join to a calendar dataframe which extracts the min and max from that range, count the days between them and use a case_when to categorize them. This approach seems a bit convoluted and would appreciate any guidance as to what would be a better approach.

Thanks for the suggestions!

Example Dataset:

Example <- structure(list(Name.Station = c("Station A", "Station A", "Station A", 
                                        "Station A", "Station A", "Station B", "Station B", "Station B", 
                                        "Station C", "Station C", "Station C", "Station C"), Rainfall.Date = c("7/10/2020", 
                                                                                                               "8/12/2020", "8/01/2021", "25/06/2021", "26/10/2021", "7/01/2020", 
                                                                                                               "22/01/2020", "5/02/2020", "5/09/2020", "5/10/2020", "5/11/2020", 
                                                                                                               "5/12/2020"), Rainfall.Amount = c(210, 210, 208.47, 208.16, 203.67, 
                                                                                                                                                 227.49, 225, 222.54, 250, 250, 246.18, 245.15)), class = "data.frame", row.names = c(NA, 
                                                                                                                                                                                                                                      -12L))
like image 878
Raul Avatar asked Oct 14 '25 19:10

Raul


2 Answers

by station you could calculate the cumsum of rainfall greater than the threshold in mm. Then calculate the length of the sequence of days from start date to the date which is maximum in the cumsum.

First of all, though, your dates should be formatted properly.

Example <- transform(Example, Rainfall.Date=as.Date(Rainfall.Date, '%d/%m/%Y'))

do.call(rbind, by(Example, Example$Name.Station, \(x) {
  f <- \(mm, x.=x) {
    mx <- which.max(cumsum(x.$Rainfall.Amount) > mm)
    length(do.call(seq.Date, c(as.list(range(x.$Rainfall.Date[1:mx])), 1)))
  }
  ds <- seq.int(200, 1e3, 200)  ## sequence of 200, 400, ... , 1000mm
  r <- t(vapply(ds, f, 0))
  data.frame(Name.Station=el(x$Name.Station), `colnames<-`(r, paste0('d_', ds)))
}))
#           Name.Station d_200 d_400 d_600 d_800 d_1000
# Station A    Station A     1    63    94   262    385
# Station B    Station B     1    16    30     1      1
# Station C    Station C     1    31    62    92      1

Note: R >= 4.1 used.

like image 65
jay.sf Avatar answered Oct 17 '25 08:10

jay.sf


Here is a tidyverse approach:

library(dplyr)
library(tidyr)

Example %>%
  group_by(Name.Station) %>%
  mutate(Rainfall.Date = as.Date(Rainfall.Date, "%d/%m/%Y"),
         days = cumsum(c(1, diff(Rainfall.Date))),
         crainfall = cumsum(Rainfall.Amount),
         fi = (findInterval(crainfall, seq(0, 1000, 200)) -1) * 200) %>%
  pivot_wider(id_cols = Name.Station, names_from = fi, values_from = days, names_glue = {"days_to_{fi}_mm"}, values_fn = min)

# A tibble: 3 x 6
# Groups:   Name.Station [3]
  Name.Station days_to_200_mm days_to_400_mm days_to_600_mm days_to_800_mm days_to_1000_mm
  <chr>                 <dbl>          <dbl>          <dbl>          <dbl>           <dbl>
1 Station A                 1             63             94            262             385
2 Station B                 1             16             30             NA              NA
3 Station C                 1             31             62             92              NA
like image 43
Ritchie Sacramento Avatar answered Oct 17 '25 08:10

Ritchie Sacramento