Count Based on Value of Another Column in R

Tags: r

I am trying to create a new column (called Error_1_Count) in my dataframe that holds a running count of how many times 'Error Type 1' has appeared in the 'Error' column for each distinct value of 'Name'. An example of what I'd like the resulting dataframe to look like is below.

I have tried a loop with an assignment based on the error (see below), but the count in my output is not correct: it only ever contains 0 and 1.

Please let me know how I can improve my code and ensure that the count only gets reset for new values of 'Name'. Thank you!

Goal Result in Table


Name       Error         Error_1_Count
A       Error Type 1          1
A       Error Type 4          1
A       Error Type 1          2
B       Error Type 2          0
A       Error Type 1          3
C       Error Type 3          0
D       Error Type 1          1


names <- unique(data.df$name)
count <- 0

for (i in names) {

  data.df[data.df$name == i, data.df$error_1_count <- ifelse(data.df$error == 'Error Type 1', count + 1, count)]

}


#View(data.df)
#print(unique(data.df$error_1_count))


piper180 asked Jan 26 '23 13:01


2 Answers

You can use ave with cumsum: compare Error to "Error Type 1" and take the cumulative sum of the result within each Name.

x$Error_1_Count <- ave(x$Error == "Error Type 1", x$Name, FUN=cumsum)
x
#  Name        Error Error_1_Count
#1    A Error Type 1             1
#2    A Error Type 4             1
#3    A Error Type 1             2
#4    B Error Type 2             0
#5    A Error Type 1             3
#6    C Error Type 3             0
#7    D Error Type 1             1

Data:

x <- structure(list(Name = structure(c(1L, 1L, 1L, 2L, 1L, 3L, 4L), .Label = c("A", 
"B", "C", "D"), class = "factor"), Error = structure(c(1L, 4L, 
1L, 2L, 1L, 3L, 1L), .Label = c("Error Type 1", "Error Type 2", 
"Error Type 3", "Error Type 4"), class = "factor")), row.names = c(NA, 
-7L), class = "data.frame")
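
To see why this works, it may help to split the one-liner into steps. The sketch below is only an illustration: the data frame `demo` and the intermediate names `is_type1`, `running_all` and `running_by_name` are made up for this example, and the data simply mirrors the table in the question.

```r
# Hypothetical demo data mirroring the table in the question
demo <- data.frame(
  Name  = c("A", "A", "A", "B", "A", "C", "D"),
  Error = c("Error Type 1", "Error Type 4", "Error Type 1", "Error Type 2",
            "Error Type 1", "Error Type 3", "Error Type 1"),
  stringsAsFactors = FALSE
)

# Step 1: logical flag, TRUE on rows where Error is 'Error Type 1'
is_type1 <- demo$Error == "Error Type 1"

# Step 2: cumsum() treats TRUE as 1 and FALSE as 0; without grouping
# it keeps counting across all rows, regardless of Name
running_all <- cumsum(is_type1)

# Step 3: ave() applies cumsum separately within each Name and returns
# the results in the original row order
running_by_name <- ave(is_type1, demo$Name, FUN = cumsum)

print(data.frame(demo, running_all, running_by_name))
```

The ungrouped version never resets, while the grouped version starts again from zero for each Name, which is the behaviour asked for in the question.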
GKi answered Jan 28 '23 02:01


A similar idea with dplyr: group by Name, then take the cumulative sum of the comparison within each group.

library(dplyr)
df1 %>%
     group_by(Name) %>%
     # write to a new column so the original Error values are kept
     mutate(Error_1_Count = cumsum(Error == "Error Type 1")) %>%
     ungroup()

Data:

df1 <- structure(list(Name = c("A", "A", "A", "B", "A", "C", "D"), Error = c("Error Type 1", 
"Error Type 4", "Error Type 1", "Error Type 2", "Error Type 1", 
"Error Type 3", "Error Type 1")), row.names = c(NA, -7L), class = "data.frame")
akrun answered Jan 28 '23 02:01