I have some data in CSV like:
"Timestamp", "Count"
"2009-07-20 16:30:45", 10
"2009-07-20 16:30:45", 15
"2009-07-20 16:30:46", 8
"2009-07-20 16:30:46", 6
"2009-07-20 16:30:46", 8
"2009-07-20 16:30:47", 20
I can read it into R using read.csv. I'd like to plot:
The number of rows per second:
"2009-07-20 16:30:45", 2
"2009-07-20 16:30:46", 3
"2009-07-20 16:30:47", 1
The average Count per second:
"2009-07-20 16:30:45", 12.5
"2009-07-20 16:30:46", 7.333
"2009-07-20 16:30:47", 20
Is there some way to do this (collect by second/min/etc & plot) in R?
Read your data, and convert it into a zoo object:
R> library(zoo)
R> X <- read.csv("/tmp/so.csv")
R> X <- zoo(X$Count, order.by=as.POSIXct(as.character(X[,1])))
Note that this will show warnings because of non-unique timestamps.
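The warning comes from the repeated timestamps in the sample data. A minimal base R sketch that makes this visible before building the zoo object follows; the inline csv_text (written without spaces after the commas), the UTC time zone, and the variable names are assumptions made for illustration, not part of the original answer.

```r
# Inline copy of the sample data, so the example does not depend on /tmp/so.csv.
csv_text <- '"Timestamp","Count"
"2009-07-20 16:30:45",10
"2009-07-20 16:30:45",15
"2009-07-20 16:30:46",8
"2009-07-20 16:30:46",6
"2009-07-20 16:30:46",8
"2009-07-20 16:30:47",20'

dat <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Parse the timestamps; UTC is an assumption, use the zone your data was logged in.
dat$Timestamp <- as.POSIXct(dat$Timestamp, tz = "UTC")

# Rows whose timestamp already appeared earlier. Any value above zero means
# the index is not unique, which is what zoo warns about.
n_repeats <- sum(duplicated(dat$Timestamp))
print(n_repeats)
```

Repeated timestamps are expected for this kind of log, and the aggregate calls in this answer collapse them into one value per timestamp.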
Task 1 using aggregate with length to count:
R> aggregate(X, force, length)
2009-07-20 16:30:45 2009-07-20 16:30:46 2009-07-20 16:30:47
                  2                   3                   1
Task 2 using aggregate with mean to average:
R> aggregate(X, force, mean)
2009-07-20 16:30:45 2009-07-20 16:30:46 2009-07-20 16:30:47
             12.500               7.333              20.000
Task 3 can be done the same way by aggregating up to a coarser index, for example timestamps truncated to the minute or the hour. You can call plot on the result from aggregate:
plot(aggregate(X, force, mean))
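As a sketch of Task 3, the index can be truncated to the minute before aggregating. This assumes the zoo package is installed; the helper name to_minute, the UTC time zone, and the inline copy of the sample data are illustrative choices rather than part of the original answer.

```r
library(zoo)

# Sample data from the question; UTC is an assumed time zone.
stamps <- as.POSIXct(c("2009-07-20 16:30:45", "2009-07-20 16:30:45",
                       "2009-07-20 16:30:46", "2009-07-20 16:30:46",
                       "2009-07-20 16:30:46", "2009-07-20 16:30:47"),
                     tz = "UTC")
counts <- c(10, 15, 8, 6, 8, 20)

# zoo warns about the duplicated timestamps; that is expected here.
X <- suppressWarnings(zoo(counts, order.by = stamps))

# Hypothetical helper: map each index value to the start of its minute.
to_minute <- function(t) as.POSIXct(trunc(t, units = "mins"))

# Same aggregate calls as for Tasks 1 and 2, with a coarser index.
per_minute_n    <- aggregate(X, to_minute, length)
per_minute_mean <- aggregate(X, to_minute, mean)

print(per_minute_n)
print(per_minute_mean)
```

With the six sample rows, everything falls into the single minute 16:30. Passing units = "hours" to trunc would group by hour in the same way, and either result can be handed to plot.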
Averaging the data is easy with the plyr package.
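library(plyr)
# dataset is assumed to be the data frame from read.csv (columns Timestamp, Count)
# One output row per distinct timestamp: mean of Count and number of rows
Second <- ddply(dataset, "Timestamp", function(x){
  c(Average = mean(x$Count), N = nrow(x))
})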
To do the same thing by minute or hour, you need to add fields with that information.
library(chron)
dataset$Minute <- minutes(dataset$Timestamp)
dataset$Hour <- hours(dataset$Timestamp)
dataset$Day <- dates(dataset$Timestamp)
# aggregate by hour
Hour <- ddply(dataset, c("Day", "Hour"), function(x){
  c(Average = mean(x$Count), N = nrow(x))
})
# aggregate by minute
Minute <- ddply(dataset, c("Day", "Hour", "Minute"), function(x){
  c(Average = mean(x$Count), N = nrow(x))
})
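For readers without plyr or chron, the same grouping idea can be sketched in base R by deriving the Day, Hour and Minute fields with format() on parsed timestamps. This is an alternative technique, not the answer's own code; the UTC time zone and the names parsed, avg, n and by_minute are assumptions.

```r
# Sample data from the question.
dataset <- data.frame(
  Timestamp = c("2009-07-20 16:30:45", "2009-07-20 16:30:45",
                "2009-07-20 16:30:46", "2009-07-20 16:30:46",
                "2009-07-20 16:30:46", "2009-07-20 16:30:47"),
  Count = c(10, 15, 8, 6, 8, 20),
  stringsAsFactors = FALSE
)

# Parse once; UTC is an assumed time zone.
parsed <- as.POSIXct(dataset$Timestamp, tz = "UTC")

# Grouping fields derived with format(); these are character columns.
dataset$Day    <- format(parsed, "%Y-%m-%d")
dataset$Hour   <- format(parsed, "%H")
dataset$Minute <- format(parsed, "%M")

# Mean and row count per minute, computed separately and then joined.
avg <- aggregate(Count ~ Day + Hour + Minute, data = dataset, FUN = mean)
n   <- aggregate(Count ~ Day + Hour + Minute, data = dataset, FUN = length)
names(avg)[names(avg) == "Count"] <- "Average"
names(n)[names(n) == "Count"] <- "N"
by_minute <- merge(avg, n, by = c("Day", "Hour", "Minute"))

print(by_minute)
```

Dropping Minute from both formulas and from the merge keys gives the per-hour summary instead.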