Get monthly means from dataframe of several years of daily temps

Tags:

r

I have daily temperature values for several years, 1949-2010. I would like to calculate monthly means. Here is an example of the data:

head(tmeasmax)
TIMESTEP   MEAN.C. MINIMUM.C. MAXIMUM.C. VARIANCE.C.2. STD_DEV.C.      SUM COUNT
1949-01-01  6.836547       6.65       7.33    0.02850574  0.1688364 1.426652     6
1949-01-02 10.533371      10.24      10.74    0.06140426  0.2477988 1.426652     6
1949-01-03 18.746729      18.02      19.78    0.18507860  0.4302076 1.426652     6
1949-01-04 21.244562      20.09      22.40    0.76106980  0.8723931 1.426652     6
1949-01-05  3.826716       3.11       5.37    0.52706647  0.7259935 1.426652     6
1949-01-06  9.127782       8.46      10.26    0.20236358  0.4498484 1.426652     6

str(tmeasmax)
'data.frame':   22645 obs. of  8 variables:
 $ TIMESTEP     : Date, format: "1949-01-01" "1949-01-02" ...
 $ MEAN.C.      : num  6.84 10.53 18.75 21.24 3.83 ...
 $ MINIMUM.C.   : num  6.65 10.24 18.02 20.09 3.11 ...
 $ MAXIMUM.C.   : num  7.33 10.74 19.78 22.4 5.37 ...
 $ VARIANCE.C.2.: num  0.0285 0.0614 0.1851 0.7611 0.5271 ...
 $ STD_DEV.C.   : num  0.169 0.248 0.43 0.872 0.726 ...
 $ SUM          : num  1.43 1.43 1.43 1.43 1.43 ...
 $ COUNT        : int  6 6 6 6 6 6 6 6 6 6 ...

There is a previous question that I couldn't make heads or tails of. I imagine I can probably use aggregate, but I don't know how to break up the dates into the years and months and then approach the nesting of the months inside the years. I tried a loop inside of a loop, but I can never get nested loops to work.

EDIT to reply to comments/questions: I was looking for the mean of "MEAN.C."

shea asked Dec 09 '25 06:12


2 Answers

Here's a quick data.table solution. I'm assuming you want the means of MEAN.C.

library(data.table)
# setDT() converts tmeasmax to a data.table by reference; group by year and month of TIMESTEP
setDT(tmeasmax)[, .(MonthlyMeans = mean(MEAN.C.)), by = .(year(TIMESTEP), month(TIMESTEP))]
#    year month MonthlyMeans
# 1: 1949     1     11.71928

You can also do this for all the columns at once if you want:

tmeasmax[, lapply(.SD, mean), by = .(year(TIMESTEP), month(TIMESTEP))]
#    year month  MEAN.C. MINIMUM.C. MAXIMUM.C. VARIANCE.C.2. STD_DEV.C.      SUM COUNT
# 1: 1949     1 11.71928     11.095   12.64667     0.2942481   0.482513 1.426652     6
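
If only some of the columns are of interest, one option is to restrict .SD with .SDcols. This is a minimal sketch rather than a definitive recipe: `toy` and `cols` are hypothetical names, and `toy` holds only the six rows shown in the question instead of the full tmeasmax.

```r
library(data.table)

# Toy stand-in for tmeasmax: the six rows printed in the question
toy <- data.table(
  TIMESTEP   = as.Date("1949-01-01") + 0:5,
  MEAN.C.    = c(6.836547, 10.533371, 18.746729, 21.244562, 3.826716, 9.127782),
  MINIMUM.C. = c(6.65, 10.24, 18.02, 20.09, 3.11, 8.46),
  COUNT      = 6L
)

# Columns to average (hypothetical choice); COUNT is deliberately left out
cols <- c("MEAN.C.", "MINIMUM.C.")

# One row per year/month pair, with the mean of each selected column
res <- toy[, lapply(.SD, mean),
           by = .(year = year(TIMESTEP), month = month(TIMESTEP)),
           .SDcols = cols]
print(res)
```

On these six rows the result is a single row for January 1949, and the MEAN.C. and MINIMUM.C. values agree with the output shown above.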

David Arenburg answered Dec 10 '25 19:12


Here's a way to do it with the dplyr package:

library(dplyr)
library(lubridate)

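# str() above shows TIMESTEP is already a Date; this step only matters if it was read in as character
tmeasmax$TIMESTEP <- ymd(tmeasmax$TIMESTEP)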

tmeasmax %>% 
  group_by(Year=year(TIMESTEP), Month=month(TIMESTEP)) %>%
  summarise(meanDailyMin=mean(MINIMUM.C.),
            meanDailyMean=mean(MEAN.C.))

  Year Month meanDailyMin meanDailyMean
1 1949     1       11.095      11.71928

You can summarise any other column by month in a similar way.
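
Since the question mentions aggregate, the same grouping can also be sketched in base R without any packages. This assumes a toy data frame (`toy`, a hypothetical name) built from the six rows shown in the question; the `Year` and `Month` helper columns are likewise introduced only for illustration.

```r
# Toy stand-in for tmeasmax: dates and MEAN.C. from the six rows in the question
toy <- data.frame(
  TIMESTEP = as.Date("1949-01-01") + 0:5,
  MEAN.C.  = c(6.836547, 10.533371, 18.746729, 21.244562, 3.826716, 9.127782)
)

# Break each date into its year and month parts
toy$Year  <- as.integer(format(toy$TIMESTEP, "%Y"))
toy$Month <- as.integer(format(toy$TIMESTEP, "%m"))

# Average MEAN.C. within every Year/Month combination;
# the formula interface handles the "months inside years" nesting, so no loops are needed
monthly <- aggregate(MEAN.C. ~ Year + Month, data = toy, FUN = mean)
print(monthly)
```

With the full data, the two grouping columns yield one row per month of each year. On the six sample rows the single January 1949 mean matches the 11.71928 shown above.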

eipi10 answered Dec 10 '25 20:12


