
dplyr group_by throws an error on a variable not used in the call

I am using R 3.4.0 and dplyr 0.5.0 (I have also tested with R 3.3.3 and get the same error).

I have used the following type of code regularly in the past (even yesterday!), but for some reason it produces an error today.

For instance, I have data at 5-minute intervals that I want to summarize by 15-minute intervals. Since I cannot group_by a POSIXlt DateTime, I convert the variable to character. However, when I apply group_by, it raises an error about the original POSIXlt variable, even though I passed only the character variable to the function.

Here is a reproducible example:

library(dplyr)

z <- seq(ISOdatetime(2017,01,01, 00,00,00), ISOdatetime(2017,02,28,23,45,00), by="5 min")
# One value per timestamp (16990 when no DST change falls in this range)
q <- rnorm(length(z), mean=120, sd=75)

d <- data.frame("Dates"=z, "values"=q)

# Round the time to the nearest 15min
d$DatesRound <- as.POSIXlt(round(as.double(d$Dates)/(15*60))*(15*60),origin=(as.POSIXlt('1970-01-01')))

# Transform into character
d$DatesRoundChar <- as.character(d$DatesRound)

d2 <-
  d %>%
  group_by(DatesRoundChar)%>%
  summarise(total=sum(values))

And here is the error I have:

Error in grouped_df_impl(data, unname(vars), drop) : column 'DatesRound' has unsupported class : POSIXlt, POSIXt

I have also tried converting with:

d$DatesRoundChar <- strftime(as.POSIXct(d$DatesRound))
d$DatesRoundChar <- sapply(d$DatesRound, as.character)

But I still get the same error.

Does anyone know why it throws an error on a variable that is not even used in the function? And how can I fix it?

Catherine Gladu asked Feb 05 '23 12:02


1 Answer

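The POSIXlt class is what causes the trouble in the dplyr chain, because it is an unsupported class in dplyr. As the error and traceback below show, the check happens inside grouped_df_impl and fails on the DatesRound column even though only DatesRoundChar is named in group_by: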

d %>% 
   group_by(DatesRoundChar)

Error in grouped_df_impl(data, unname(vars), drop) : Column DatesRound: unsupported class POSIXlt/POSIXt

traceback()
#14: stop(list(message = "Column `DatesRound`: unsupported class POSIXlt/POSIXt", 
#        call = grouped_df_impl(data, unname(vars), drop), cppstack = NULL))
#13: .Call("dplyr_grouped_df_impl", PACKAGE = "dplyr", data, symbols, 
#        drop)
#12: grouped_df_impl(data, unname(vars), drop)
#11: grouped_df(groups$data, groups$group_names)
#10: group_by.data.frame(., DatesRoundChar)
#9: group_by(., DatesRoundChar)
#8: function_list[[k]](value)
#7: withVisible(function_list[[k]](value))
#6: freduce(value, `_function_list`)
#5: `_fseq`(`_lhs`)
#4: eval(expr, envir, enclos)
#3: eval(quote(`_fseq`(`_lhs`)), env, env)
#2: withVisible(eval(quote(`_fseq`(`_lhs`)), env, env))
#1: d %>% group_by(DatesRoundChar)
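
To see why the two date-time classes are treated differently, here is a minimal base R sketch. The names `ct` and `lt` and the timestamp are illustrative, not taken from the question. A POSIXct value is a single number of seconds since the epoch, while a POSIXlt value is a list of date-time fields, which is the kind of column this dplyr version rejects.

```r
# Illustrative values only; `ct` and `lt` are hypothetical names
ct <- as.POSIXct("2017-01-01 00:15:00", tz = "UTC")
lt <- as.POSIXlt(ct)

class(ct)                # "POSIXct" "POSIXt"
class(lt)                # "POSIXlt" "POSIXt"

is.numeric(unclass(ct))  # TRUE: stored as seconds since 1970-01-01
is.list(unclass(lt))     # TRUE: stored as a list of fields (sec, min, hour, ...)

# Converting the POSIXlt value back gives the same instant
as.POSIXct(lt) == ct
```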

Instead, we can store the rounded times as POSIXct with as.POSIXct:

d$DatesRound <- as.POSIXct(round(as.double(d$Dates)/(15*60))*
                   (15*60),origin=(as.POSIXlt('1970-01-01')))
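
Once the rounded column is POSIXct, the character copy may not be needed at all, since dplyr can group on POSIXct directly. The following is a small sketch under assumptions: the data are made up (one hour of 5-minute readings with values 1 to 12) and the time zone is fixed to UTC so the result does not depend on the local setting.

```r
library(dplyr)

# Made-up data: twelve 5-minute readings starting at midnight UTC
d <- data.frame(
  Dates  = seq(ISOdatetime(2017, 1, 1, 0, 0, 0, tz = "UTC"),
               by = "5 min", length.out = 12),
  values = 1:12
)

# Round to the nearest 15 minutes and keep the result as POSIXct
d$DatesRound <- as.POSIXct(round(as.double(d$Dates) / (15 * 60)) * (15 * 60),
                           origin = "1970-01-01", tz = "UTC")

# Group on the POSIXct column itself
d2 <- d %>%
  group_by(DatesRound) %>%
  summarise(total = sum(values))
```

With this toy input, `d2` has one row per 15-minute bin, and the totals add up to the sum of the original values.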

Another option is to remove the 'DatesRound' column before the group_by:

d %>% 
  select(-DatesRound) %>% 
  group_by(DatesRoundChar) %>%
  summarise(total=sum(values))
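
If a data frame may contain several POSIXlt columns, a third possibility is to convert all of them before grouping. This is only a sketch: `lt_to_ct` is a hypothetical helper written for illustration, not a dplyr function, and the example data are made up.

```r
# Hypothetical helper, not part of dplyr: convert every POSIXlt column to POSIXct
lt_to_ct <- function(df) {
  df[] <- lapply(df, function(col) {
    if (inherits(col, "POSIXlt")) as.POSIXct(col) else col
  })
  df
}

# Made-up example data with a POSIXlt column added via `$<-`
d <- data.frame(values = 1:3)
d$when <- as.POSIXlt(c("2017-01-01 00:00:00",
                       "2017-01-01 00:15:00",
                       "2017-01-01 00:30:00"), tz = "UTC")

d_fixed <- lt_to_ct(d)
class(d_fixed$when)
```

The converted data frame can then be passed to group_by without dropping any columns.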

akrun answered Feb 07 '23 08:02