When I use the complete() function to fill in rows for combinations that have no cases in my data, I find that it also creates many duplicate rows. These can be removed with unique(), but I want to understand how to avoid generating the extra rows in the first place.
library(dplyr)
library(tidyr)
# An incomplete table
mtcars %>%
  group_by(vs, cyl) %>%
  count()
# complete() creates a table with many duplicate rows
temp <-
  mtcars %>%
  group_by(vs, cyl) %>%
  count() %>%
  complete(vs = c(0, 1), cyl = c(4, 6, 8), fill = list(n = 0))
unique(temp)
This is answered in a comment by @aosmith.
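The duplicates come from the grouped data: count() keeps the vs and cyl grouping, and complete() then operates within each group rather than on the whole table. Ungrouping with ungroup() before calling complete() solves the issue: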
temp <-
  mtcars %>%
  group_by(vs, cyl) %>%
  count() %>%
  ungroup() %>%
  complete(vs = c(0, 1), cyl = c(4, 6, 8), fill = list(n = 0))
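As a possible alternative (not part of the original comment), the grouping can be avoided altogether by passing the columns straight to count(), which returns an ungrouped result when its input is ungrouped. The sketch below first inspects the grouping left behind by the group_by() version, then builds the completed table without ungroup(). The names counted and temp2 are just illustrative.

```r
library(dplyr)
library(tidyr)

# count() called on a grouped data frame keeps that grouping
counted <- mtcars %>%
  group_by(vs, cyl) %>%
  count()
group_vars(counted)  # lists the grouping columns that survive count()

# Passing the columns to count() directly gives an ungrouped result,
# so complete() can fill in the missing combination without ungroup()
temp2 <- mtcars %>%
  count(vs, cyl) %>%
  complete(vs = c(0, 1), cyl = c(4, 6, 8), fill = list(n = 0))

nrow(temp2)           # one row per vs/cyl combination
anyDuplicated(temp2)  # 0 means there are no duplicate rows
```

Checking group_vars() (or is_grouped_df()) before calling complete() is a quick way to spot this kind of leftover grouping in a longer pipeline.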