
Summarise data in a function instead of subsetting

Tags: r, aggregate

For a sample dataframe:

bout <- structure(list(Date = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "02/02/2013", class = "factor"),
Time = structure(1:30, .Label = c("07:55:40", "07:55:50",
"07:56:00", "07:56:10", "07:56:20", "07:56:30", "07:56:40",
"07:56:50", "07:57:00", "07:57:10", "07:57:20", "07:57:30",
"07:57:40", "07:57:50", "07:58:00", "07:58:10", "07:58:20",
"07:58:30", "07:58:40", "07:58:50", "07:59:00", "07:59:10",
"07:59:20", "07:59:30", "07:59:40", "07:59:50", "08:00:00",
"08:00:10", "08:00:20", "08:00:30"), class = "factor"), Axis1 = c(0L,
0L, 100L, 500L, 233L, 155L, 60L, 0L, 0L, 115L, 80L, 878L,
158L, 0L, 13L, 0L, 0L, 25L, 10L, 45L, 33L, 43L, 655L, 498L,
41L, 151L, 404L, 436L, 28L, 0L), Latitude = c(56.52289678,
56.52291659, 56.52292762, 56.52295108, 56.52292694, 56.52292513,
56.5229401, 56.52294825, 56.52295531, 56.52296413, 56.52296976,
56.52292374, 56.52293053, 56.52292422, 56.52289636, 56.52288866,
56.52293357, 56.52290114, 56.5228365, 56.52280237, 56.52279844,
56.52281107, 56.52282589, 56.52279711, 56.52277008, 56.52278785,
56.52279951, 56.52269176, 56.52270186, 56.52269016), Longitude = c(-2.56573101,
-2.56578171, -2.56579263, -2.56578099, -2.56575181, -2.56574877,
-2.56575947, -2.5657653, -2.56577941, -2.56577104, -2.56577004,
-2.56576048, -2.56575937, -2.56582402, -2.56585538, -2.56579373,
-2.56572003, -2.56568263, -2.56568237, -2.56570739, -2.56570637,
-2.56571299, -2.56572322, -2.56566835, -2.56566237, -2.56569353,
-2.56571833, -2.56563307, -2.56565902, -2.56565666), area = structure(c(1L,
1L, 2L, 2L, 2L, 2L, 3L, 4L, 5L, 6L, 6L, 7L, 7L, 7L, 8L, 9L,
10L, 11L, 2L, 2L, 6L, 6L, 6L, 6L, 12L, 13L, 13L, 13L, 13L,
13L), .Label = c("E456", "E457", "E460", "E461", "E462",
"E463", "E465", "E468", "E469", "E470", "E471", "E478", "E479"
), class = "factor"), bout = c(0L, 0L, 1L, 1L, 1L,
1L, 1L, 0L, 0L, 2L, 2L, 2L, 2L, 2L, 2L, 0L, 0L, 0L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 0L)), .Names = c("Date",
"Time", "Axis1", "Latitude", "Longitude", "area", "bout"
), class = "data.frame", row.names = c(NA, -30L))

I want to create a summary table describing each activity bout (i.e. values 1 to 3 in the bout column).

My programming skills are limited, so I would simply subset the data and populate a table. I would first convert the dates and times to classes R can work with:

bout$Date <- strptime(bout$Date, "%d/%m/%Y")
bout$Time <- strptime(bout$Time, "%H:%M:%S")

NB - If anyone can help me get the Time conversion working I would be most grateful: R is adding today's date.
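For reference, strptime() fills in the current date whenever the format string has no date fields, which is why today's date appears. One possible workaround, sketched here on a made-up three-row data frame (the names demo and DateTime are illustrative and not part of the original data), is to parse the date and time columns together into a single POSIXct value:

```r
# Hypothetical three-row stand-in for the question's data
demo <- data.frame(Date = c("02/02/2013", "02/02/2013", "02/02/2013"),
                   Time = c("07:56:00", "07:56:10", "07:56:40"),
                   stringsAsFactors = FALSE)

# Parse date and time together so no default date is filled in.
# tz = "UTC" is an arbitrary assumption; use the zone the data was logged in.
demo$DateTime <- as.POSIXct(paste(demo$Date, demo$Time),
                            format = "%d/%m/%Y %H:%M:%S", tz = "UTC")

# Time between the first and last reading, in seconds
duration <- difftime(max(demo$DateTime), min(demo$DateTime), units = "secs")
print(duration)
```

Because the result is POSIXct rather than the list-based POSIXlt that strptime() returns, it can be stored as an ordinary data frame column.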

I then subset the data and use a few simple functions to calculate some summary data (i.e. date and time of first activity and its duration).

bout.1 <- subset(bout, bout==1)
min.Date.1 <- min(bout.1$Date)
min.Time.1 <- min(bout.1$Time)
max.Time.1 <- max(bout.1$Time)
time.bout.1 <- difftime(max.Time.1, min.Time.1)

... I would then populate a summary table and repeat for the different bouts.

How could I automate this to summarise all the bouts in one function (there could be any number of bouts)?

Any help would be greatly appreciated.

KT_1 asked Mar 23 '26 17:03



1 Answer

Here's a solution using plyr's ddply() function for aggregation and chron for keeping times without dates. Note that ddply() doesn't seem to work well with POSIXt dates (strptime() returns the list-based POSIXlt class), so the dates are converted with as.Date() instead, which creates a column of class "Date".

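# Start from the original data, before any strptime() conversion
library(chron)
# Dates: parse day/month/year text into class "Date" (no origin needed for text input)
bout$Date <- as.Date(bout$Date, format = "%d/%m/%Y")
# Times: chron's times() stores a time of day with no date attached
bout$Time <- times(as.character(bout$Time))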

my.stats <- function(x) {
    min.Date <- min(x$Date)
    min.Time <- min(x$Time)
    max.Time <- max(x$Time)
    time.bout <- max.Time - min.Time
    return(data.frame(min.Date, min.Time, max.Time, time.bout))
}

library(plyr)
ddply(bout, .(bout), my.stats)

#   bout   min.Date min.Time max.Time time.bout
# 1    0 2013-02-02 07:55:40 08:00:30  00:04:50
# 2    1 2013-02-02 07:56:00 07:56:40  00:00:40
# 3    2 2013-02-02 07:57:10 07:58:00  00:00:50
# 4    3 2013-02-02 07:58:40 08:00:20  00:01:40
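
If plyr and chron are not available, the same split-summarise-combine idea can be sketched in base R. This is an alternative under stated assumptions, not the approach above: it uses a small made-up data frame (demo), a combined DateTime column, and a hypothetical helper called bout_stats, none of which appear in the original question.

```r
# Made-up stand-in for the question's data: two bouts on one day
demo <- data.frame(
  Date = rep("02/02/2013", 5),
  Time = c("07:56:00", "07:56:10", "07:56:40", "07:57:10", "07:58:00"),
  bout = c(1L, 1L, 1L, 2L, 2L),
  stringsAsFactors = FALSE
)

# One combined date-time column (tz = "UTC" is an arbitrary assumption)
demo$DateTime <- as.POSIXct(paste(demo$Date, demo$Time),
                            format = "%d/%m/%Y %H:%M:%S", tz = "UTC")

# Summarise one bout: first reading, last reading, duration in seconds
bout_stats <- function(x) {
  data.frame(
    bout     = x$bout[1],
    start    = min(x$DateTime),
    end      = max(x$DateTime),
    duration = as.numeric(difftime(max(x$DateTime), min(x$DateTime),
                                   units = "secs"))
  )
}

# Split by bout, summarise each piece, and stack the results
pieces <- split(demo, demo$bout)
summary_table <- do.call(rbind, lapply(pieces, bout_stats))
print(summary_table)
```

Because split() creates one piece per distinct value of bout, this handles any number of bouts without changes. To leave out the non-bout rows (bout == 0 in the question's data), the data frame could be filtered before the split.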
Jason V answered Mar 26 '26 12:03



