
Hourly mean of multiple variables in R data.frame?

I have the following code and am trying to find the hourly mean of each variable (i.e., X, Y, and Z). My output should be a data.frame with an hourlyDate column and the hourly mean of every variable. Any way forward would be appreciated.

library(lubridate)

set.seed(123)

T <- data.frame(Datetime = seq(ymd_hms("2011-01-01 00:00:00"), to= ymd_hms("2011-12-31 00:00:00"), by = "5 min"),
                X = runif(104833, 5,10),Y = runif(104833, 5,10), Z = runif(104833, 5,10))
T$Date <- format(T$Datetime, format="%Y-%m-%d")
T$Hour <- format(T$Datetime, format = "%H")
T$Mints <- format(T$Datetime, format = "%M")
Hydro asked Oct 29 '25 17:10



1 Answer

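Try flooring each timestamp to the hour with floor_date(), then grouping on that column and averaging the rest: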

library(lubridate)
library(dplyr)

set.seed(123)

T <- data.frame(Datetime = seq(ymd_hms("2011-01-01 00:00:00"), to= ymd_hms("2011-12-31 00:00:00"), by = "5 min"),
                X = runif(104833, 5,10),Y = runif(104833, 5,10), Z = runif(104833, 5,10))



T %>% mutate(hourlyDate = floor_date(Datetime,unit='hour')) %>%
      select(-Datetime) %>% group_by(hourlyDate) %>% 
      summarize(across(everything(),mean)) %>%
      ungroup()
#> `summarise()` ungrouping output (override with `.groups` argument)
#> # A tibble: 8,737 x 4
#>    hourlyDate              X     Y     Z
#>    <dttm>              <dbl> <dbl> <dbl>
#>  1 2011-01-01 00:00:00  8.00  7.90  6.90
#>  2 2011-01-01 01:00:00  7.93  7.47  7.90
#>  3 2011-01-01 02:00:00  7.83  6.89  7.67
#>  4 2011-01-01 03:00:00  6.61  7.92  7.18
#>  5 2011-01-01 04:00:00  7.27  7.20  6.48
#>  6 2011-01-01 05:00:00  7.88  6.80  7.69
#>  7 2011-01-01 06:00:00  7.07  8.05  7.52
#>  8 2011-01-01 07:00:00  7.40  7.92  6.99
#>  9 2011-01-01 08:00:00  7.97  7.76  7.26
#> 10 2011-01-01 09:00:00  7.57  7.47  6.94
#> # ... with 8,727 more rows

Created on 2020-08-20 by the reprex package (v0.3.0)
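
If dplyr is not available, the same idea (truncate each timestamp to its hour, then average per group) can be sketched in base R with trunc() and aggregate(). This is only an illustration, not part of the original answer: the names df and hourly are hypothetical, the data is shortened to a single day, and the time zone is assumed to be UTC.

```r
# Base R sketch: hourly means without dplyr or lubridate.
set.seed(123)

# Same shape as the question's data, shortened to one day (assumption).
datetime <- seq(as.POSIXct("2011-01-01 00:00:00", tz = "UTC"),
                as.POSIXct("2011-01-01 23:55:00", tz = "UTC"),
                by = "5 min")
n <- length(datetime)
df <- data.frame(Datetime = datetime,
                 X = runif(n, 5, 10),
                 Y = runif(n, 5, 10),
                 Z = runif(n, 5, 10))

# Truncate each timestamp to the start of its hour.
df$hourlyDate <- as.POSIXct(trunc(df$Datetime, units = "hours"))

# Average X, Y and Z within each hour; one row per hour.
hourly <- aggregate(cbind(X, Y, Z) ~ hourlyDate, data = df, FUN = mean)
head(hourly)
```

Listing the columns explicitly in cbind(X, Y, Z) also sidesteps the character columns (Date, Hour, Mints) created in the question, which cannot be averaged.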

Waldi answered Oct 31 '25 07:10
