
What parameter must an R function have to use it within the mutate function from tidyverse?

Tags: r, dplyr, tidyverse

I have a tibble with a column of strings representing hours and minutes. I want to tidy that column and convert its elements into integers representing the total number of minutes.

Those strings can have one of the following forms:

  • "5" (which means 5 minutes)
  • "XX min" (meaning xx minutes)
  • "X Std" (meaning x hours)
  • "X Std. YY min" (meaning x hours and yy minutes)

I wrote a function to convert those strings into minutes.

  • "5" should become 5.
  • "45 min" should become 45.
  • "2 Std" should become 120.
  • "1 Std. 30 min" should become 90.

This is what the function looks like:

convert_ZA_time <- function(string) {
    if (nchar(string) == 1) {
      result <- as.integer(string)
    }
    else if (endsWith(string, " Std")) {
      result <- as.integer(substring(string, 1, 1)) * 60
    }
    else if (endsWith(string, " min") && nchar(string) == 6) {
      result <- as.integer(substring(string, 1, 2))
    }
    else if (endsWith(string, " min") && nchar(string) > 6) {
      hour <- as.integer(gsub(" Std.*", "", string, perl = TRUE))
      minute_plus <- gsub("^\\d+ Std. ", "", string, perl = TRUE)
      minute <- as.integer(gsub(" min$", "", minute_plus))
      result <- hour * 60 + minute
    }
    else {result <- NA}
    return(result)
}

Tested with a single string, it works fine:

convert_ZA_time("2 Std. 50 min")
# prints [1] 170

But when I try to use this function inside the tidyverse mutate function, I get the following error:

df <- tibble(datestr = c("5", "45 min", "1 Std", "2 Std. 30 min"))
df2 <- df %>% mutate(minutes = convert_ZA_time(datestr))
# throws error: the condition has length > 1 and only the first element will be used

How do I have to change my function to use it within mutate correctly?

P.S. As I understand it, mutate takes each "datestr" value and passes it to the function "convert_ZA_time". But apparently mutate passes the whole vector to the function?

Thanks for any help!

asked Feb 02 '26 02:02 by Stefan Boehringer


1 Answer

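Your function isn't vectorized yet. mutate passes the whole datestr column to it as a single vector, but if expects a condition of length one.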

convert_ZA_time(c("2 Std. 50 min", "3 Std. 50 min"))
# [1] 170 230
# Warning messages:
# 1: In if (nchar(string) == 1) { :
#   the condition has length > 1 and only the first element will be used
# 2: In if (endsWith(string, " Std")) { :
#   the condition has length > 1 and only the first element will be used

Fix:

convert_ZA_timev <- Vectorize(convert_ZA_time)
      
convert_ZA_timev(c("2 Std. 50 min", "3 Std. 50 min"))
# 2 Std. 50 min 3 Std. 50 min 
#           170           230 
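To connect this back to the data frame in the question, here is a minimal sketch of a per-element wrapper. It is not the code above: it assumes vapply() in place of Vectorize(), which enforces an integer result and drops the names that Vectorize() attaches for character input. The name convert_ZA_time_each is hypothetical, and the scalar function is repeated (with NA_integer_ as the fallback) only so the block runs on its own.

```r
# Scalar converter from the question, repeated so the sketch is self-contained.
# Only change (an assumption of this sketch): the fallback is NA_integer_,
# so every branch returns an integer.
convert_ZA_time <- function(string) {
  if (nchar(string) == 1) {
    as.integer(string)
  } else if (endsWith(string, " Std")) {
    as.integer(substring(string, 1, 1)) * 60L
  } else if (endsWith(string, " min") && nchar(string) == 6) {
    as.integer(substring(string, 1, 2))
  } else if (endsWith(string, " min") && nchar(string) > 6) {
    hour   <- as.integer(gsub(" Std.*", "", string, perl = TRUE))
    minute <- as.integer(gsub("^\\d+ Std. | min$", "", string, perl = TRUE))
    hour * 60L + minute
  } else {
    NA_integer_
  }
}

# Hypothetical wrapper: calls the scalar function once per element and
# returns a plain, unnamed integer vector.
convert_ZA_time_each <- function(strings) {
  vapply(strings, convert_ZA_time, FUN.VALUE = integer(1), USE.NAMES = FALSE)
}

datestr <- c("5", "45 min", "1 Std", "2 Std. 30 min")
convert_ZA_time_each(datestr)
# [1]   5  45  60 150

# Inside a pipeline (assumes dplyr is attached and df is the tibble
# from the question):
# df2 <- df %>% mutate(minutes = convert_ZA_time_each(datestr))
```

If you stay with Vectorize(), wrapping the call in unname() should give the same unnamed column.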

Explanation

You have an if / else structure in your function, like this one:

fun <- function(x) if (x >= 0) "pos" else "neg"

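When applied to a vector of length greater than one, if evaluates only the first element and issues a warning. That is the behaviour in R versions before 4.2.0; from 4.2.0 on, it is an error, which is what the question shows.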

v <- -2:2

fun(v)
# [1] "neg"
# Warning message:
#   In if (x >= 0) "pos" else "neg" :
#   the condition has length > 1 and only the first element will be used

fun(v[1])
# [1] "neg"

Vectorize wraps the function in mapply, so it is called once per element and the results are combined into a vector.

funv <- Vectorize(fun)
funv(v)
# [1] "neg" "neg" "pos" "pos" "pos"
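An alternative to wrapping is to write the function with vectorized operations from the start. The sketch below is an assumption about how that could look, not part of the original answer: ifelse() replaces if / else in the toy example, and a hypothetical convert_ZA_time_vec() uses grepl() and sub() on the whole vector. Unlike the function in the question, it also accepts multi-digit hours and single-digit "X min" values.

```r
# Toy example: ifelse() tests every element instead of only the first.
fun_vec <- function(x) ifelse(x >= 0, "pos", "neg")
fun_vec(-2:2)
# [1] "neg" "neg" "pos" "pos" "pos"

# Hypothetical vectorized converter for the formats listed in the question:
# "5", "45 min", "2 Std", "1 Std. 30 min".
convert_ZA_time_vec <- function(string) {
  hours   <- rep(0L, length(string))
  minutes <- rep(0L, length(string))

  # Elements that start with "<digits> Std" carry an hour part.
  has_h <- grepl("^[0-9]+ Std", string)
  hours[has_h] <- as.integer(sub("^([0-9]+) Std.*$", "\\1", string[has_h]))

  # Elements that end with "<digits>" or "<digits> min" carry a minute part.
  has_m <- grepl("[0-9]+( min)?$", string)
  minutes[has_m] <- as.integer(
    sub("^.*?([0-9]+)( min)?$", "\\1", string[has_m], perl = TRUE)
  )

  result <- hours * 60L + minutes
  # Anything that matched neither pattern is treated as unparseable.
  result[!(has_h | has_m)] <- NA_integer_
  result
}

convert_ZA_time_vec(c("5", "45 min", "1 Std", "2 Std. 30 min"))
# [1]   5  45  60 150
```

Because every step works on the full vector, this version can be passed to mutate directly, without Vectorize.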
answered Feb 04 '26 15:02 by jay.sf


