I have a data frame with a sequence in 'col1' and values in 'col2':
col1 col2
2 0.02
5 0.12
9 0.91
13 1.13
I want to expand the irregular sequence in 'col1' into a regular sequence from 1 to 13. For the values of 'col1' that are missing from the original data, I want 'col2' to have the value 0 in the final output:
col1 col2
1 0
2 0.02
3 0
4 0
5 0.12
6 0
7 0
8 0
9 0.91
10 0
11 0
12 0
13 1.13
How can I do this in R?
To find the numbers missing from a sequence in an R data frame column, we can use the setdiff() function.
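As a minimal sketch of the setdiff() idea, the snippet below rebuilds the question's data frame by hand (the name `df` and the helper variables `full_seq` and `missing_vals` are chosen here for illustration) and lists the values of the regular sequence that do not occur in 'col1'.

```r
# Data from the question, typed in by hand
df <- data.frame(col1 = c(2, 5, 9, 13),
                 col2 = c(0.02, 0.12, 0.91, 1.13))

# Regular sequence from 1 up to the largest value in col1
full_seq <- seq_len(max(df$col1))

# Values of the regular sequence that are absent from col1
missing_vals <- setdiff(full_seq, df$col1)
print(missing_vals)
# [1]  1  3  4  6  7  8 10 11 12
```

This only identifies the gaps; the rows for them still have to be added, for example with a join.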
First, to exclude missing values from mathematical operations, use the na.rm = TRUE argument. If you do not exclude these values, most functions will return NA. We may also want to subset our data to keep only complete observations, that is, the observations (rows) that contain no missing data.
We can use complete.cases() to print a logical vector that indicates which rows are complete (i.e. rows without NA) and which have missing values. Rows 2 and 3 are complete; rows 1, 4, and 5 have one or more missing values. We can also create a complete subset of our example data by using the complete.
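The example data referred to above is not shown, so the sketch below uses a hypothetical data frame `ex` constructed so that rows 2 and 3 are complete and rows 1, 4, and 5 each contain an NA. It illustrates both na.rm = TRUE and complete.cases().

```r
# Hypothetical example data: rows 2 and 3 are complete,
# rows 1, 4, and 5 each have a missing value
ex <- data.frame(x = c(NA, 2, 3, 4, NA),
                 y = c(1, 2, 3, NA, 5))

# Without na.rm the mean is NA; with na.rm = TRUE the NAs are dropped first
mean_with_na <- mean(ex$x)                  # NA
mean_without_na <- mean(ex$x, na.rm = TRUE) # 3

# Logical vector: TRUE for rows that contain no NA
is_complete <- complete.cases(ex)
print(is_complete)
# [1] FALSE  TRUE  TRUE FALSE FALSE

# Keep only the complete rows
ex_complete <- ex[is_complete, ]
print(ex_complete)
```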
Just for completeness, a self binary join using data.table (you will get NAs instead of zeroes, but that could be easily changed if needed):
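library(data.table)
# Join df against the full sequence 1..max(col1);
# sequence values with no match in col1 get NA in col2
setDT(df)[.(seq(max(col1))), on = .(col1)]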
# col1 col2
# 1: 1 NA
# 2: 2 0.02
# 3: 3 NA
# 4: 4 NA
# 5: 5 0.12
# 6: 6 NA
# 7: 7 NA
# 8: 8 NA
# 9: 9 0.91
# 10: 10 NA
# 11: 11 NA
# 12: 12 NA
# 13: 13 1.13
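For comparison, here is a minimal base R sketch of the same expansion that also turns the NAs into zeroes, as the question asks. It is a swapped-in alternative to the data.table join above, not the answer's own method; `df` is rebuilt from the question's table, and `full` and `out` are names chosen here for illustration.

```r
# Data from the question, typed in by hand
df <- data.frame(col1 = c(2, 5, 9, 13),
                 col2 = c(0.02, 0.12, 0.91, 1.13))

# One row per value of the regular sequence 1..max(col1)
full <- data.frame(col1 = seq_len(max(df$col1)))

# Left join: keep every row of 'full'; unmatched rows get NA in col2
out <- merge(full, df, by = "col1", all.x = TRUE)

# Replace the NAs introduced by the join with 0
out$col2[is.na(out$col2)] <- 0
print(out)
```

If the data.table result is stored in a variable, say `res`, the analogous replacement would be `res[is.na(col2), col2 := 0]`.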