I am having some trouble with the code below. It returns:
"Error in summarise_impl(.data, dots) : Column rate must be length 1 (a summary value), not 22906"
Is there a problem with my code?
sub_grade is of type character and int_rate is numeric.
results <- loan_data %>%
  select(credit_grade, sub_grade, int_rate, loan_amnt) %>%
  group_by(sub_grade) %>%
  summarise(
    rate = substr(int_rate * 100, 1, 4),
    nr_loans = n(),
    "&",
    percent1 = substr((nr_loans / a) * 100, 1, 5),
    klj = "&",
    Amount = sum(loan_amnt, na.rm = TRUE),
    klj1 = "&",
    percent2 = substr((Amount / total) * 100, 1, 5)
  )
The problem shows up only when I add the first variable, rate.
Reproducible example:
sub_grade <- c("A1", "A2", "A3", "A1", "A3")
int_rate <- c(0.023, 0.027, 0.033, 0.023, 0.033)
What I want is:
sub_grade int_rate
The problem is that dplyr::summarise expects exactly one value per group. But substr(int_rate * 100, ...) in your code returns a value for each row, i.e. many values per group. You need to reduce each group to a single value first, using a function such as min, max, first, or last inside substr. Given the sample data that the OP has posted, a solution could be:
# Data
sub_grade <- c("A1", "A2", "A3","A1","A3")
int_rate <- c(0.023, 0.027,0.033,0.023,0.033)
loan_data <- data.frame(sub_grade, int_rate, stringsAsFactors = FALSE)
# Use dplyr to summarise on sub_grade
library(dplyr)
loan_data %>%
  group_by(sub_grade) %>%
  summarise(int_rate = first(int_rate)) %>%
  as.data.frame()
#   sub_grade int_rate
# 1        A1    0.023
# 2        A2    0.027
# 3        A3    0.033
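The same idea can be carried back into the original summarise call. The following is only a sketch: the loan_amnt values are made up for illustration (the question's sample data does not include them), and it assumes that taking the first int_rate in each sub_grade is what is wanted, which holds only if the rate is constant within a sub-grade.

```r
library(dplyr)

# Sample data from the question; loan_amnt values are hypothetical
loan_data <- data.frame(
  sub_grade = c("A1", "A2", "A3", "A1", "A3"),
  int_rate  = c(0.023, 0.027, 0.033, 0.023, 0.033),
  loan_amnt = c(1000, 2000, 1500, 500, 2500),
  stringsAsFactors = FALSE
)

results <- loan_data %>%
  group_by(sub_grade) %>%
  summarise(
    # first() collapses the group to one value before substr() is applied
    rate     = substr(first(int_rate) * 100, 1, 4),
    nr_loans = n(),
    Amount   = sum(loan_amnt, na.rm = TRUE)
  ) %>%
  as.data.frame()

print(results)
```

Every expression inside summarise now yields a single value per group, so the length error should no longer be raised. If the rate can vary within a sub-grade, mean or max may be a better choice than first.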