I have a time-series dataset of daily consumption which looks like the following:
consumption <- data.frame(
  date = as.Date(c('2020-06-01', '2020-06-02', '2020-06-03', '2020-06-03',
                   '2020-06-03', '2020-06-04', '2020-06-05', '2020-06-05')),
  val = c(10, 20, 31, 32, 33, 40, 51, 52)
)
library(dplyr)

consumption <- consumption %>%
  group_by(date) %>%
  mutate(n = n(), record = row_number()) %>%
  ungroup()
consumption
# A tibble: 8 × 4
date val n record
<date> <dbl> <int> <int>
1 2020-06-01 10 1 1
2 2020-06-02 20 1 1
3 2020-06-03 31 3 1
4 2020-06-03 32 3 2
5 2020-06-03 33 3 3
6 2020-06-04 40 1 1
7 2020-06-05 51 2 1
8 2020-06-05 52 2 2
Some days have more than one row in the dataset. I would like to split this into groups covering all possible combinations of the duplicated days (3 choices on 2020-06-03 × 2 choices on 2020-06-05 = 6 groups), such as:
Group 1:
date val record
1 2020-06-01 10 1
2 2020-06-02 20 1
3 2020-06-03 31 1
4 2020-06-04 40 1
5 2020-06-05 51 1
Group 2:
date val record
1 2020-06-01 10 1
2 2020-06-02 20 1
3 2020-06-03 31 1
4 2020-06-04 40 1
5 2020-06-05 52 2
Group 3:
date val record
1 2020-06-01 10 1
2 2020-06-02 20 1
3 2020-06-03 32 2
4 2020-06-04 40 1
5 2020-06-05 51 1
Group 4:
date val record
1 2020-06-01 10 1
2 2020-06-02 20 1
3 2020-06-03 32 2
4 2020-06-04 40 1
5 2020-06-05 52 2
Group 5:
date val record
1 2020-06-01 10 1
2 2020-06-02 20 1
3 2020-06-03 33 3
4 2020-06-04 40 1
5 2020-06-05 51 1
Group 6:
date val record
1 2020-06-01 10 1
2 2020-06-02 20 1
3 2020-06-03 33 3
4 2020-06-04 40 1
5 2020-06-05 52 2
I've tried the following solution, but it does not produce the desired results.
library(dplyr)
library(purrr)

out <- consumption %>%
  filter(n > 1) %>%
  group_split(date, rn = row_number()) %>%
  map(~ bind_rows(consumption %>% filter(n == 1),
                  .x %>% select(-rn)) %>%
        arrange(date))
Any help getting around this would be much appreciated.
Many thanks,
We could filter the rows where 'n' is greater than 1, group_split by 'date' and the row_number(), then bind the rows with the filtered data where 'n' is 1 (note: this output reflects the original data, which did not yet include the 2020-06-05 rows):
library(dplyr)
library(purrr)

out <- consumption %>%
  filter(n > 1) %>%
  group_split(date, rn = row_number()) %>%
  map(~ bind_rows(consumption %>% filter(n == 1),
                  .x %>% select(-rn)) %>%
        arrange(date))
-output
> out
[[1]]
# A tibble: 4 x 4
date val n record
<date> <dbl> <int> <int>
1 2020-06-01 10 1 1
2 2020-06-02 20 1 1
3 2020-06-03 31 3 1
4 2020-06-04 40 1 1
[[2]]
# A tibble: 4 x 4
date val n record
<date> <dbl> <int> <int>
1 2020-06-01 10 1 1
2 2020-06-02 20 1 1
3 2020-06-03 32 3 2
4 2020-06-04 40 1 1
[[3]]
# A tibble: 4 x 4
date val n record
<date> <dbl> <int> <int>
1 2020-06-01 10 1 1
2 2020-06-02 20 1 1
3 2020-06-03 33 3 3
4 2020-06-04 40 1 1
With the updated data, we create the row_number(), then split it by the 'date' column (as in @ThomasIsCoding's solution), use crossing() (from tidyr) to expand the row indices into all combinations, and loop over the rows with pmap(), using slice() to pull the corresponding rows out of the original data (a sketch of the intermediate crossing() table is shown after the output):
library(tidyr)
library(tibble)

consumption %>%
  transmute(date, rn = row_number()) %>%
  deframe %>%
  split(names(.)) %>%
  invoke(crossing, .) %>%
  pmap(~ consumption %>%
         slice(c(...))) %>%
  unname
-output
[[1]]
# A tibble: 5 x 4
date val n record
<date> <dbl> <int> <int>
1 2020-06-01 10 1 1
2 2020-06-02 20 1 1
3 2020-06-03 31 3 1
4 2020-06-04 40 1 1
5 2020-06-05 51 2 1
[[2]]
# A tibble: 5 x 4
date val n record
<date> <dbl> <int> <int>
1 2020-06-01 10 1 1
2 2020-06-02 20 1 1
3 2020-06-03 31 3 1
4 2020-06-04 40 1 1
5 2020-06-05 52 2 2
[[3]]
# A tibble: 5 x 4
date val n record
<date> <dbl> <int> <int>
1 2020-06-01 10 1 1
2 2020-06-02 20 1 1
3 2020-06-03 32 3 2
4 2020-06-04 40 1 1
5 2020-06-05 51 2 1
[[4]]
# A tibble: 5 x 4
date val n record
<date> <dbl> <int> <int>
1 2020-06-01 10 1 1
2 2020-06-02 20 1 1
3 2020-06-03 32 3 2
4 2020-06-04 40 1 1
5 2020-06-05 52 2 2
[[5]]
# A tibble: 5 x 4
date val n record
<date> <dbl> <int> <int>
1 2020-06-01 10 1 1
2 2020-06-02 20 1 1
3 2020-06-03 33 3 3
4 2020-06-04 40 1 1
5 2020-06-05 51 2 1
[[6]]
# A tibble: 5 x 4
date val n record
<date> <dbl> <int> <int>
1 2020-06-01 10 1 1
2 2020-06-02 20 1 1
3 2020-06-03 33 3 3
4 2020-06-04 40 1 1
5 2020-06-05 52 2 2
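For reference, a minimal sketch of the intermediate table that the crossing() step builds with this data: one column of candidate row indices per date (the column names come from deframe(), which keeps the dates as character names). Each row of this table is then passed to pmap(), and slice(c(...)) pulls exactly those rows out of 'consumption', which gives the six groups above.
idx <- consumption %>%
  transmute(date, rn = row_number()) %>%
  deframe %>%              # named vector: names are the dates, values are row numbers 1..8
  split(names(.)) %>%      # row indices per date: 1, 2, c(3, 4, 5), 6, c(7, 8)
  invoke(crossing, .)      # all combinations: 1 x 1 x 3 x 1 x 2 = 6 rows
idx
# roughly:
# A tibble: 6 x 5
#   `2020-06-01` `2020-06-02` `2020-06-03` `2020-06-04` `2020-06-05`
# 1            1            2            3            6            7
# 2            1            2            3            6            8
# 3            1            2            4            6            7
# 4            1            2            4            6            8
# 5            1            2            5            6            7
# 6            1            2            5            6            8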
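A similar result can also be obtained in base R with expand.grid() over the per-day row indices. This is only a sketch, not part of the original answer, and it assumes 'consumption' is already sorted by date (so sorting the row numbers preserves date order); the groups may also come out in a different order than above.
idx_list <- split(seq_len(nrow(consumption)), consumption$date)  # row indices per date
combos   <- expand.grid(idx_list, KEEP.OUT.ATTRS = FALSE)        # 6 x 5 index combinations
out2     <- lapply(seq_len(nrow(combos)),
                   function(i) consumption[sort(unlist(combos[i, ])), ])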