I don't often have to work with dates in R, but I imagine this is fairly easy. I have a column that represents a date in a dataframe. I simply want to create a new dataframe that summarizes a 2nd column by Month/Year using the date. What is the best approach?
I want a second dataframe so I can feed it to a plot.
Any help you can provide will be greatly appreciated!
EDIT: For reference:
> str(temp) 'data.frame': 215746 obs. of 2 variables: $ date : POSIXct, format: "2011-02-01" "2011-02-01" "2011-02-01" ... $ amount: num 1.67 83.55 24.4 21.99 98.88 ... > head(temp) date amount 1 2011-02-01 1.670 2 2011-02-01 83.550 3 2011-02-01 24.400 4 2011-02-01 21.990 5 2011-02-03 98.882 6 2011-02-03 24.900
Basically, the code will define the month and year of each monthly value in separate columns. Then, it will create a vector of daily data using the minimum and maximum dates in your monthly data, and will create two separate columns for year and month for the daily data as well.
I'd do it with lubridate
and plyr
, rounding dates down to the nearest month to make them easier to plot:
library(lubridate) df <- data.frame( date = today() + days(1:300), x = runif(300) ) df$my <- floor_date(df$date, "month") library(plyr) ddply(df, "my", summarise, x = mean(x))
There is probably a more elegant solution, but splitting into months and years with strftime()
and then aggregate()
ing should do it. Then reassemble the date for plotting.
x <- as.POSIXct(c("2011-02-01", "2011-02-01", "2011-02-01")) mo <- strftime(x, "%m") yr <- strftime(x, "%Y") amt <- runif(3) dd <- data.frame(mo, yr, amt) dd.agg <- aggregate(amt ~ mo + yr, dd, FUN = sum) dd.agg$date <- as.POSIXct(paste(dd.agg$yr, dd.agg$mo, "01", sep = "-"))
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With