I have a huge data frame that looks like this:
gene=c("A","A","A","A","B","B")
frequency=c(abs(rnorm(6,0.5,1)))
time=c(1,2,3,4,1,2)
df <- data.frame(gene,frequency,time)
  gene  frequency time
1    A 0.08463914    1
2    A 1.55639512    2
3    A 1.24172246    3
4    A 0.75038980    4
5    B 1.13189855    1
6    B 0.56896895    2
For gene B I have data only for time points 1 and 2. I want to fill in time points 3 and 4 with zeros so that my data look like this:
  gene  frequency time
1    A 0.08463914    1
2    A 1.55639512    2
3    A 1.24172246    3
4    A 0.75038980    4
5    B 1.13189855    1
6    B 0.56896895    2
7    B      0        3
8    B      0        4
Overall, I have multiple groups (i.e., genes) that I want to do this for. Any help or hints are highly appreciated.
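We can use complete() from tidyr. It adds a row for every gene/time combination that is missing, and the fill argument sets frequency to 0 in those new rows. The final select() restores the original column order.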
library(dplyr)
library(tidyr)
df %>% 
    complete(gene, time = 1:4, fill = list(frequency = 0)) %>%
    select(names(df))
Output:
# A tibble: 8 x 3
  gene  frequency  time
  <chr>     <dbl> <dbl>
1 A         0.590     1
2 A         0.762     2
3 A         0.336     3
4 A         0.437     4
5 B         0.904     1
6 B         1.97      2
7 B         0         3
8 B         0         4
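If adding dplyr and tidyr is not an option, the same idea (build every gene/time combination, join the data onto it, then replace the resulting NAs with 0) can be sketched in base R. This is an alternative to the complete() call above, not part of it. The frequency values are made-up fixed numbers so the example is reproducible, and the names grid and filled are only illustrative. It assumes every time point you want appears for at least one gene; if not, replace sort(unique(df$time)) with an explicit range such as 1:4.

```r
# Toy data shaped like the question's (fixed values instead of rnorm)
df <- data.frame(
  gene = c("A", "A", "A", "A", "B", "B"),
  frequency = c(0.1, 1.5, 1.2, 0.7, 1.1, 0.5),
  time = c(1, 2, 3, 4, 1, 2)
)

# Every gene/time combination that should exist in the result
grid <- expand.grid(
  gene = unique(df$gene),
  time = sort(unique(df$time)),
  stringsAsFactors = FALSE
)

# Left join: combinations absent from df get NA in frequency
filled <- merge(grid, df, by = c("gene", "time"), all.x = TRUE)

# Replace the NAs introduced by the join with 0
filled$frequency[is.na(filled$frequency)] <- 0

# Restore the original column order and sort by gene, then time
filled <- filled[order(filled$gene, filled$time), names(df)]
rownames(filled) <- NULL

print(filled)
```

Note that this also turns any NA already present in frequency into 0, so check for pre-existing missing values first if that distinction matters.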