I struggle with dates and times in R, but I am hoping this is a fairly basic task.
Here is my dataset:
> str(temp.df)
'data.frame': 74602 obs. of 2 variables:
$ time : POSIXct, format: "2011-04-09 03:53:20" "2011-04-09 03:53:15" "2011-04-09 03:53:07" "2011-04-09 03:52:39" ...
$ value: num 1 1 1 1 1 1 1 1 1 1 ...
> head(temp.df$time, n=10)
[1] "2011-04-09 03:53:20 EDT" "2011-04-09 03:53:15 EDT" "2011-04-09 03:53:07 EDT" "2011-04-09 03:52:39 EDT"
[5] "2011-04-09 03:52:29 EDT" "2011-04-09 03:51:56 EDT" "2011-04-09 03:51:54 EDT" "2011-04-09 03:51:46 EDT"
[9] "2011-04-09 03:51:44 EDT" "2011-04-09 03:51:26 EDT"
and for convenience...
> dput(head(temp.df$time, n=10))
structure(c(1302335600, 1302335595, 1302335587, 1302335559, 1302335549,
1302335516, 1302335514, 1302335506, 1302335504, 1302335486), class = c("POSIXct",
"POSIXt"), tzone = "")
What I am looking to do:
Any help you can provide will be greatly appreciated.
Binning is the process of transforming numerical or continuous data into categorical data. It is a common pre-processing step in model building. The rbin package offers, among other features, manual binning through a Shiny app and an equal-length binning method.
R provides several options for dealing with date and date/time data. The built-in as.Date function handles dates (without times); the contributed library chron handles dates and times, but does not control for time zones; and the POSIXct and POSIXlt classes allow for dates and times with control for time zones.
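As a rough illustration of how the base classes differ, here is a minimal sketch using the first timestamp from the question. The time zone "America/New_York" is an assumption on my part; the question only shows "EDT".

```r
# Date: calendar date only, stored as days since 1970-01-01
d <- as.Date("2011-04-09")

# POSIXct: date-time stored as seconds since 1970-01-01 UTC
# (the time zone name is an assumption; the question only prints "EDT")
ct <- as.POSIXct("2011-04-09 03:53:20", tz = "America/New_York")

# POSIXlt: the same instant broken down into named components
lt <- as.POSIXlt(ct)

unclass(d)       # number of days
as.numeric(ct)   # number of seconds
lt$hour          # hour component in the stored time zone
lt$min           # minute component
```

POSIXct is usually the more convenient class for data frame columns and arithmetic, while POSIXlt is handy when individual fields such as the hour are needed.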
To create a Date object from a simple character string in R, you can use the as.Date() function. The character string has to obey a format that can be defined using a set of symbols (the examples correspond to 13 January 1982): %Y : 4-digit year (1982)
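A short sketch of the format symbols in use, parsing the same date (13 January 1982) from three differently formatted strings. The input strings are made up for illustration; %d, %m and %y are the standard day, month and 2-digit-year codes documented in ?strptime.

```r
# Default ISO format needs no format string
d1 <- as.Date("1982-01-13")

# Day/month/4-digit year
d2 <- as.Date("13/01/1982", format = "%d/%m/%Y")

# Month/day/2-digit year (%y maps 82 to 1982)
d3 <- as.Date("01/13/82", format = "%m/%d/%y")

# The same symbols work in the other direction with format()
out <- format(d1, "%d.%m.%Y")
out
```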
bins - Cuts points in vector x into evenly distributed groups (bins). bins takes 3 separate approaches to generating the cuts, picks the one resulting in the least mean square deviation from the ideal cut (length(x) / target.bins points in each bin), and then merges small bins unless excat.
Use the proper time series packages zoo and/or xts. This example, straight from the help page of aggregate.zoo(), aggregates POSIXct seconds data every 10 minutes:
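library(zoo)
# 200 observations, one every 10 seconds, indexed by POSIXct times
tt <- seq(10, 2000, 10)
x <- zoo(tt, structure(tt, class = c("POSIXct", "POSIXt")))
# floor each timestamp to the start of its 600-second bin, then average within bins
aggregate(x, time(x) - as.numeric(time(x)) %% 600, mean)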
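If you would rather stay in base R, the same flooring trick can be applied directly to the data frame from the question. This is only a sketch: the 10-minute bin width and the choice of sum as the summary are assumptions, since the question does not state the exact goal, and only the ten timestamps from the dput() output are used.

```r
# First ten timestamps from the question's dput() output
time <- structure(c(1302335600, 1302335595, 1302335587, 1302335559, 1302335549,
                    1302335516, 1302335514, 1302335506, 1302335504, 1302335486),
                  class = c("POSIXct", "POSIXt"), tzone = "")
temp.df <- data.frame(time = time, value = 1)

# Floor each timestamp to the start of its 10-minute (600-second) bin
# (bin width is an assumption; change 600 to suit)
temp.df$bin <- temp.df$time - as.numeric(temp.df$time) %% 600

# Summarise the values within each bin (sum is an assumed choice)
binned <- aggregate(value ~ bin, data = temp.df, FUN = sum)
binned
```

All ten sample timestamps fall within the same 10-minute window, so this small sample collapses into a single bin; the full 74602-row data frame would produce one row per occupied bin.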
The to.period() function in xts is also a sure winner. There are countless examples here on SO and on the r-sig-finance list.
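For what it's worth, a rough sketch of the xts route might look like the following. The per-second series here is made up for illustration, and the 10-minute period is an assumption carried over from the zoo example. Note that to.period() keeps selected observations (or OHLC bars) per period, so for a per-bin summary such as the mean, period.apply() with endpoints() is shown as an alternative.

```r
library(xts)

# Hypothetical per-second series: 2000 observations starting just after the epoch (UTC)
idx <- as.POSIXct(seq(1, 2000), origin = "1970-01-01", tz = "UTC")
x <- xts(seq_along(idx), order.by = idx)

# Collapse to 10-minute periods; OHLC = FALSE keeps the last observation of each period
bars <- to.period(x, period = "minutes", k = 10, OHLC = FALSE)

# Alternative: compute a summary (here the mean) within each 10-minute period
ep <- endpoints(x, on = "minutes", k = 10)
means <- period.apply(x, INDEX = ep, FUN = mean)

bars
means
```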