Let's pretend I have something like this:
df <- data.frame(
  PERSON = c("Peter", "Peter", "Marcel", "Lisa", "Lisa"),
  FRUIT = c("Apple", "Peach", "Apple", "Apple", "Peach"),
  A = c(100, 200, 100, 200, 300),
  B = c(1, 2, 3, 4, 5)
)
# Convert to factors; FRUIT gets an extra level ("Coconut") that does not occur in the data
df$PERSON <- as.factor(df$PERSON)
df$FRUIT <- factor(df$FRUIT, levels = c("Apple", "Peach", "Coconut"))
Which results in str(df):
'data.frame': 5 obs. of 4 variables:
$ PERSON: Factor w/ 3 levels "Lisa","Marcel",..: 3 3 2 1 1
$ FRUIT : Factor w/ 3 levels "Apple","Peach",..: 1 2 1 1 2
$ A : num 100 200 100 200 300
$ B : num 1 2 3 4 5
I want to expand this data frame so that for every PERSON all levels of FRUIT are present, like this:
  PERSON   FRUIT   A B
1  Peter   Apple 100 1
2  Peter   Peach 200 2
3  Peter Coconut   0 0
4 Marcel   Apple 100 3
5 Marcel   Peach   0 0
6 Marcel Coconut   0 0
7   Lisa   Apple 200 4
8   Lisa   Peach 300 5
9   Lisa Coconut   0 0
Missing values for A and B should be filled with 0.
I tried tidyr::complete(df$FRUIT, 0), but it seems that I used this function incorrectly.
complete() takes the data frame as its first argument ('data'), followed by the columns to expand. By default, the values for newly created combinations are filled with NA, but we can change that to 0 by passing a named list to the fill argument.
complete(df, PERSON, FRUIT, fill = list(A=0, B = 0))
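As a minimal end-to-end sketch, assuming the tidyr package is installed, the call above can be run against the question's data as follows. Because FRUIT is a factor, complete() expands over all of its levels, including the unused "Coconut". The variable name expanded is only illustrative and not part of the original question.

```r
library(tidyr)

# Rebuild the example data from the question
df <- data.frame(
  PERSON = c("Peter", "Peter", "Marcel", "Lisa", "Lisa"),
  FRUIT = c("Apple", "Peach", "Apple", "Apple", "Peach"),
  A = c(100, 200, 100, 200, 300),
  B = c(1, 2, 3, 4, 5)
)
df$PERSON <- as.factor(df$PERSON)
df$FRUIT <- factor(df$FRUIT, levels = c("Apple", "Peach", "Coconut"))

# Expand to every PERSON x FRUIT combination; fill new rows with 0 for A and B
expanded <- complete(df, PERSON, FRUIT, fill = list(A = 0, B = 0))

print(expanded)
```

The result is a tibble with one row per PERSON and FRUIT combination (3 x 3 = 9 rows). Note that the rows should come back ordered by factor level (Lisa, Marcel, Peter) rather than in the order of first appearance shown in the question.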