I have a dataframe (let's call it df1) that looks something like this...
Date Price
2014-08-06 22
2014-08-06 89
2014-09-15 56
2014-06-04 41
2015-01-19 11
2015-05-23 5
2014-07-21 108
There are other variables in the dataframe but we will ignore them for now, as I do not require them.
I have previously ordered it using
df2 <- df1[order(as.Date(df1$Date, format="%Y-%m-%d")),]
And then created a dataframe containing the values for just one month, for example, just September 2015 dates...
september2015 <- df2[df2$Date >= "2015-09-01" & df2$Date <= "2015-09-30",]
I have done this for all the months in 2015 and 2014. Then I need to create an average of prices within each given month. I have done this by...
mean(september2015$Price, na.rm = TRUE)
Obviously, this is long and tedious and takes many lines of code. I am trying to make my code more efficient by using the dplyr package.
So far I have...
datesandprices <- select(df2, Date, Price)
datesandprices <- arrange(datesandprices, Date)
summarise(datesandprices, avg = mean(Price, na.rm = TRUE))
Or in a simpler form...
df1 %>%
select(Date, Price) %>%
arrange(Date) %>%
filter(Date >= 2014-08-06 & Date =< 2014-08-30)
summarise(mean(Price, na.rm = TRUE))
The filter line is not working for me and I can't figure out how to filter by dates using this method. I would like to get the mean for each month without having to calculate it one by one - and ideally extract the monthly means into a new dataframe or column that looks like...
Month Average
Jan 2014 x
Feb 2014 y
...
Nov 2015 z
Dec 2015 a
I hope this makes sense. I can't find anything on Stack Overflow that does something similar with dates (unless I am searching for the wrong functions). Many thanks!
I made a separate column in your data set that contains only the year and month. Then, I did a group_by on that column to get the means for each month.
Date <- c("2014-08-06", "2014-08-06", "2014-09-15", "2014-06-04", "2015-01-19", "2015-05-23", "2014-07-21")
Price <- c(22,89,56,41,11,5,108)
Date <- as.Date(Date, format="%Y-%m-%d")
df <- data.frame(Date, Price)
df$Month_Year <- substr(df$Date, 1,7)
library(dplyr)
df %>%
  group_by(Month_Year) %>%
  summarise(Average = mean(Price, na.rm = TRUE))
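As for the filter line in the question, three things appear to stop it from working: the dates are unquoted, so R evaluates 2014-08-06 as the subtraction 2014 - 8 - 6; `=<` is not a valid operator (it should be `<=`); and there is no `%>%` between filter() and summarise(). Below is a minimal sketch of a corrected filter, followed by one way to get "Jan 2014"-style labels in a new data frame. The names august2014, monthly_means and Month_Start are my own and not from the question, and the month abbreviations produced by `%b` depend on your locale.

```r
library(dplyr)

# Same sample data as in the answer above
df <- data.frame(
  Date  = as.Date(c("2014-08-06", "2014-08-06", "2014-09-15", "2014-06-04",
                    "2015-01-19", "2015-05-23", "2014-07-21")),
  Price = c(22, 89, 56, 41, 11, 5, 108)
)

# Filter a date range: compare against Date objects, use <= (not =<),
# and pipe the filtered rows on to summarise()
august2014 <- df %>%
  filter(Date >= as.Date("2014-08-06") & Date <= as.Date("2014-08-30")) %>%
  summarise(Average = mean(Price, na.rm = TRUE))

# Monthly means for every month at once.
# Grouping on the first day of each month keeps the key a real Date,
# so arrange() sorts chronologically rather than alphabetically.
monthly_means <- df %>%
  group_by(Month_Start = as.Date(format(Date, "%Y-%m-01"))) %>%
  summarise(Average = mean(Price, na.rm = TRUE)) %>%
  arrange(Month_Start) %>%
  mutate(Month = format(Month_Start, "%b %Y")) %>%  # label such as "Jan 2014"
  select(Month, Average)
```

With this approach monthly_means is an ordinary data frame (a tibble) with one row per month that actually occurs in the data, so months with no observations will simply be absent rather than shown as NA.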