Change multiple values in a dataframe based on two other values

Tags: r

Would anyone mind lending some knowledge? I am trying to build a new data frame from the values in the data frame below.

id   value
ant    10
cat    4
cat    6
dog    5
dog    3
dog    2
fly    9

Working through the ids in order, I want to build a data frame that follows these rules:

  • Each new id becomes a column. The largest value is 10, so there should be 10 rows.
  • The first id is ant, so every row of the ant column should be 0.
  • The next column is cat, which has two values: the first 4 rows should be 0, followed by 6 rows of 1.
  • The same logic applies to dog: the first 5 rows are 0, the next 3 rows are 1, and the last 2 rows are 0.
  • fly has only 9 rows of 0, so the last row should be NA.

It should look like this:

ant  cat  dog  fly
0    0    0    0
0    0    0    0
0    0    0    0
0    0    0    0
0    1    0    0
0    1    1    0
0    1    1    0
0    1    1    0
0    1    0    0
0    1    0    NA

I know how to do this the long way:

newdf <- data.frame(matrix(2, ncol = length(unique(df[,"id"])) , nrow = 10))
newdf$X1[1:10] <- 0
newdf$X2[1:4] <- 0
newdf$X2[5:10] <- 1
...

However, is there a more efficient way to do this? My actual data will have roughly 50 rows, which is why I am looking for something that does not require writing out each assignment by hand.

anonymous asked Mar 19 '26 19:03


1 Answer

Here's a tidyverse answer:

library(dplyr)
library(tidyr)

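df %>%
  group_by(id) %>%
  mutate(val = rep(c(0, 1), length.out = n())) %>%     # alternate 0, 1 within each id
  uncount(value) %>%                                    # repeat each row `value` times
  mutate(row = row_number()) %>%                        # row index within each id
  complete(row = 1:10) %>%                              # pad each id to 10 rows with NA
  pivot_wider(names_from = id, values_from = val) %>%   # one column per id
  select(-row)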

#     ant   cat   dog   fly
#   <dbl> <dbl> <dbl> <dbl>
# 1     0     0     0     0
# 2     0     0     0     0
# 3     0     0     0     0
# 4     0     0     0     0
# 5     0     1     0     0
# 6     0     1     1     0
# 7     0     1     1     0
# 8     0     1     1     0
# 9     0     1     0     0
#10     0     1     0    NA

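For each id, we assign alternating 0 and 1 values to its rows and use uncount() to repeat each row as many times as its value. We then number the rows within each id, use complete() to pad every id out to 10 rows (the padding is NA), and pivot the data to wide format so that each id gets its own column.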
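To make the first two steps concrete, here is a small sketch that runs them on the dog rows alone. The names dog and long are only illustrative and are not part of the answer above.

```r
library(dplyr)
library(tidyr)

# Only the dog rows from the question, rebuilt here so the example is self-contained
dog <- data.frame(id = "dog", value = c(5, 3, 2))

long <- dog %>%
  group_by(id) %>%
  mutate(val = rep(c(0, 1), length.out = n())) %>%  # labels become 0, 1, 0
  uncount(value) %>%                                 # 5 + 3 + 2 = 10 rows
  ungroup()

long$val
#> [1] 0 0 0 0 0 1 1 1 0 0
```

The three input rows turn into ten rows whose val column is already the dog column of the desired result. The remaining steps only pad shorter ids with NA and reshape to wide format.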

data

df <- structure(list(id = c("ant", "cat", "cat", "dog", "dog", "dog", 
"fly"), value = c(10, 4, 6, 5, 3, 2, 9)), row.names = c(NA, -7L
), class = "data.frame")
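
If you would rather avoid extra packages, the same idea can be sketched in base R. This is one possible approach, not part of the answer above. It assumes that the number of rows should be the largest per-id total (10 for this data) and that columns should appear in order of first appearance. The names n_rows, runs, cols and newdf are illustrative.

```r
df <- data.frame(
  id    = c("ant", "cat", "cat", "dog", "dog", "dog", "fly"),
  value = c(10, 4, 6, 5, 3, 2, 9)
)

# Assumption: the id with the largest total run length sets the row count
n_rows <- max(tapply(df$value, df$id, sum))

# Split the run lengths by id, keeping ids in order of first appearance
runs <- split(df$value, factor(df$id, levels = unique(df$id)))

# One column per id: alternate 0/1 per run, repeat by run length, pad with NA
cols <- lapply(runs, function(v) {
  labels <- rep(c(0, 1), length.out = length(v))
  x <- rep(labels, times = v)
  length(x) <- n_rows  # extending a vector's length fills the tail with NA
  x
})

newdf <- as.data.frame(cols)
newdf
```

Here rep(..., times = v) does the job of uncount(), and assigning to length(x) does the padding that complete() handles in the tidyverse version.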

Ronak Shah answered Mar 22 '26 10:03


