Would anyone mind lending some knowledge? I am trying to build a new data frame from the values in the data frame below.
id value
ant 10
cat 4
cat 6
dog 5
dog 3
dog 2
fly 9
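For each id, I want a column whose values run in sequence: each row's value gives the length of a run, and the runs alternate between 0 and 1, starting with 0. Ids whose runs fall short of 10 rows are padded with NA. The result should look like this: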
ant cat dog fly
0 0 0 0
0 0 0 0
0 0 0 0
0 0 0 0
0 1 0 0
0 1 1 0
0 1 1 0
0 1 1 0
0 1 0 0
0 1 0 NA
I know how to do this the long way, by filling in each column by hand:
newdf <- data.frame(matrix(2, ncol = length(unique(df[,"id"])) , nrow = 10))
newdf$X1[1:10] <- 0
newdf$X2[1:4] <- 0
newdf$X2[5:10] <- 1
...
However, is there a more efficient way to do this? My actual data will have roughly 50 rows, so filling in each column by hand is not practical.
Here's a tidyverse answer:
library(dplyr)
library(tidyr)
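df %>%
  group_by(id) %>%
  # alternate 0, 1, 0, ... across the rows of each id
  mutate(val = rep(c(0, 1), length.out = n())) %>%
  # repeat each row `value` times (this drops the value column)
  uncount(value) %>%
  # number the rows within each id
  mutate(row = row_number()) %>%
  # pad every id to 10 rows; the added rows get NA
  complete(row = 1:10) %>%
  # one column per id
  pivot_wider(names_from = id, values_from = val) %>%
  select(-row)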
# ant cat dog fly
# <dbl> <dbl> <dbl> <dbl>
# 1 0 0 0 0
# 2 0 0 0 0
# 3 0 0 0 0
# 4 0 0 0 0
# 5 0 1 0 0
# 6 0 1 1 0
# 7 0 1 1 0
# 8 0 1 1 0
# 9 0 1 0 0
#10 0 1 0 NA
For each id, we assign alternating 0 and 1 values to its rows and use uncount() to repeat each row value times. complete() then pads every id to 10 rows, filling the gaps with NA, and pivot_wider() reshapes the data to wide format so that each id gets its own column.
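If you would rather avoid the tidyverse, the same idea can be sketched in base R. Treat this as a sketch under assumptions, not a drop-in replacement: `n_rows` and `runs` are names made up for illustration, and taking the number of rows as the largest per-id total of value (10 for this data) is an assumption, since the original simply hard-codes 10.

```r
# Same data as in the question, repeated so the block runs on its own
df <- data.frame(
  id    = c("ant", "cat", "cat", "dog", "dog", "dog", "fly"),
  value = c(10, 4, 6, 5, 3, 2, 9)
)

# Assumption: the output needs as many rows as the longest id in total
n_rows <- max(tapply(df$value, df$id, sum))

# For each id, alternate 0/1 across its runs and repeat each flag by its count
runs <- lapply(split(df$value, df$id), function(counts) {
  flags <- rep(c(0, 1), length.out = length(counts))
  out <- rep(flags, times = counts)
  length(out) <- n_rows  # extending a vector this way pads it with NA
  out
})

# One column per id (split() orders the ids alphabetically)
newdf <- as.data.frame(runs)
newdf
```

With the sample data, newdf should match the desired table in the question, including the NA in the last row of fly. Because `n_rows` is computed from the data, nothing needs to change when the real data has more rows.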
data
df <- structure(list(id = c("ant", "cat", "cat", "dog", "dog", "dog",
"fly"), value = c(10, 4, 6, 5, 3, 2, 9)), row.names = c(NA, -7L
), class = "data.frame")