R - Insert Missing Numbers in A Sequence by Group's Max Value

Question

I'd like to insert missing numbers in the index column following these conditions:

Partitioned by multiple columns
The minimum value is always 1
The maximum value is always the maximum for the group and type

Current Data:

group   type    index   vol
A       1       1       200
A       1       2       244
A       1       5       33

A       2       2       66
A       2       3       2
A       2       4       199
A       2       10      319

B       1       4       290
B       1       5       188
B       1       6       573
B       1       9       122

Desired Data:

group   type    index   vol
A       1       1       200
A       1       2       244
A       1       3       0
A       1       4       0
A       1       5       33

A       2       1       0
A       2       2       66
A       2       3       2
A       2       4       199
A       2       5       0
A       2       6       0
A       2       7       0
A       2       8       0
A       2       9       0
A       2       10      319

B       1       1       0
B       1       2       0
B       1       3       0
B       1       4       290
B       1       5       188
B       1       6       573
B       1       7       0
B       1       8       0
B       1       9       122

I've just added in spaces between the partitions for clarity.

Hope you can help out!

kath · Accepted Answer

You can do the following:

library(dplyr)
library(tidyr)

my_df %>% 
  group_by(group, type) %>% 
  complete(index = 1:max(index), fill = list(vol = 0))

#    group type index vol
# 1      A    1     1 200
# 2      A    1     2 244
# 3      A    1     3   0
# 4      A    1     4   0
# 5      A    1     5  33
# 6      A    2     1   0
# 7      A    2     2  66
# 8      A    2     3   2
# 9      A    2     4 199
# 10     A    2     5   0
# 11     A    2     6   0
# 12     A    2     7   0
# 13     A    2     8   0
# 14     A    2     9   0
# 15     A    2    10 319
# 16     B    1     1   0
# 17     B    1     2   0
# 18     B    1     3   0
# 19     B    1     4 290
# 20     B    1     5 188
# 21     B    1     6 573
# 22     B    1     7   0
# 23     B    1     8   0
# 24     B    1     9 122

With group_by you specify the groups you indicated with the white spaces. With complete you specify which column should be completed and over which values (here index, from 1 to the group's maximum), and then, via fill, what value the remaining column should take in the newly added rows (the default would be NA).
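To make the role of fill more concrete, here is a minimal sketch on a small made-up data frame (toy is a hypothetical name, not the question's data). It assumes a reasonably recent tidyr where complete() works per group on grouped data. It runs the same pipeline twice, once without fill and once with it, and adds ungroup() at the end so the result is no longer grouped.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: one group/type with index values 1, 3 and 4 missing.
toy <- data.frame(
  group = c("A", "A"),
  type  = c(1L, 1L),
  index = c(2L, 5L),
  vol   = c(66L, 33L)
)

# Without `fill`, the rows added by complete() get NA in vol.
with_na <- toy %>%
  group_by(group, type) %>%
  complete(index = 1:max(index)) %>%
  ungroup()

# With `fill`, the added rows get 0 in vol instead.
with_zero <- toy %>%
  group_by(group, type) %>%
  complete(index = 1:max(index), fill = list(vol = 0)) %>%
  ungroup()

with_na$vol    # NA 66 NA NA 33
with_zero$vol  # 0 66 0 0 33
```

Whether to keep the NA or replace it with 0 depends on what a missing row means in your data; the question asked for 0.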

Data

my_df <- 
  structure(list(group = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L), .Label = c("A", "B"), class = "factor"), 
                 type = c(1L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L), 
                 index = c(1L, 2L, 5L, 2L, 3L, 4L, 10L, 4L, 5L, 6L, 9L), 
                 vol = c(200L, 244L, 33L, 66L, 2L, 199L, 319L, 290L, 188L, 573L, 122L)), 
            class = "data.frame", row.names = c(NA, -11L))

Tags:

r

lostinsql

1 Answers

kath
