
Column "rate" must be length 1 (a summary value), not 22906

Tags: r, dplyr

I am having some trouble with the code below. It returns

"Error in summarise_impl(.data, dots) : Column rate must be length 1 (a summary value), not 22906"

Is there any problem with my code?

sub_grade is of type character and int_rate is numeric

results <- loan_data %>%
  select(credit_grade, sub_grade, int_rate, loan_amnt) %>%
  group_by(sub_grade) %>%
  summarise(
    rate = substr(int_rate * 100, 1, 4),
    nr_loans = n(),
    "&",
    percent1 = substr((nr_loans / a) * 100, 1, 5),
    klj = "&",
    Amount = sum(loan_amnt, na.rm = TRUE),
    klj1 = "&",
    percent2 = substr((Amount / total) * 100, 1, 5)
  )

The problem shows up only when I add the first variable rate.

Reproducible example:

sub_grade <- c("A1", "A2", "A3","A1","A3")
int_rate <- c(0.023, 0.027, 0.033, 0.023, 0.033)

What I want is:

  sub_grade int_rate
1        A1    0.023
2        A2    0.027
3        A3    0.033

Erick asked Jan 27 '23 23:01

1 Answer

The problem is that dplyr::summarise expects exactly one value per group, but substr(int_rate * 100, ...) in your code returns one value per row, i.e. many values per group (here 22906 of them, hence the error message). You need to reduce int_rate to a single value inside substr with a summary function such as min, max, first or last. For the sample data the OP posted, a solution could be:

# Data
sub_grade <- c("A1", "A2", "A3","A1","A3")
int_rate <- c(0.023, 0.027,0.033,0.023,0.033)

loan_data <- data.frame(sub_grade, int_rate, stringsAsFactors = FALSE)

# Use dplyr to summarise on sub_grade
library(dplyr)
loan_data %>% group_by(sub_grade) %>%
  summarise(int_rate = first(int_rate)) %>%
  as.data.frame()

#   sub_grade int_rate
# 1        A1    0.023
# 2        A2    0.027
# 3        A3    0.033
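
To keep the formatted rate column from the question, the same idea can be applied inside substr. The following is only a sketch on the sample data: it assumes every row in a sub_grade carries the same int_rate, so that first() is a reasonable pick (mean() or max() may fit the real data better), and it leaves out the other columns from the original call.

```r
library(dplyr)

# Sample data from the question
loan_data <- data.frame(
  sub_grade = c("A1", "A2", "A3", "A1", "A3"),
  int_rate  = c(0.023, 0.027, 0.033, 0.023, 0.033),
  stringsAsFactors = FALSE
)

results <- loan_data %>%
  group_by(sub_grade) %>%
  summarise(
    # first() collapses the group to one value *before* substr() sees it,
    # so summarise() receives a length-1 result per group
    rate = substr(first(int_rate) * 100, 1, 4),
    # n() already returns a single value per group
    nr_loans = n()
  ) %>%
  as.data.frame()

print(results)
```

Note that substr() turns the number into a character string, so rate is a character column here; if a numeric column is wanted, round(first(int_rate) * 100, 2) would be one alternative.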

MKR answered Feb 12 '23 20:02