I have data which includes variables for the hour, the minute, and the second for each observation. I want to count the number of observations before 3am, all observations before 6am, all observations before 9am and so on. Any help on this would be hugely appreciated.
Example of the data:
day hour minute second
01 17 10 03
01 17 14 20
01 17 25 27
01 17 32 39
01 17 33 40
01 17 34 10
01 17 34 14
01 17 34 16
01 17 34 21
01 17 34 23
01 17 34 25
01 17 34 31
01 17 34 36
I have about 300,000 observations like this.
hour : int 17 17 17 17 17 17 17 17 17 17
minute: int 10 14 25 32 33 34 34 34 34 34
second: int 3 20 27 39 40 10 14 16 21 23
In dplyr, count() lets you quickly count the unique values of one or more variables: df %>% count(a, b) is roughly equivalent to df %>% group_by(a, b) %>% summarise(n = n()).
Just use table(Data$ID), or as.data.frame(table(Data$ID)) if you want a data.frame back.
One approach is to create a new variable based on your binning criteria, then tabulate on that variable:
# Simulate 100 observations with random hour, minute and second values
set.seed(1)
dat <- data.frame(hour   = sample(0:23, 100, TRUE, prob = runif(24)),
                  minute = sample(0:59, 100, TRUE, prob = runif(60)),
                  second = sample(0:59, 100, TRUE, prob = runif(60)))
# Adjust bins accordingly.
# Note: the bins do not overlap, so "Before 6" holds hours 3 to 5 only.
dat <- transform(dat, bin = ifelse(hour < 3, "Before 3",
                            ifelse(hour < 6, "Before 6",
                            ifelse(hour < 9, "Before 9", "Later in day"))))
as.data.frame(table(dat$bin))
          Var1 Freq
1     Before 3    7
2     Before 6   17
3     Before 9   19
4 Later in day   57
Depending on the number of bins you need, you may run into issues with the nested ifelse() statements, but that should give you a start. Update your question with more details if you get stuck.
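If the nested ifelse() calls get unwieldy, base R's cut() can do the binning in one step. The sketch below is only an illustration: the sample hours and the names breaks, labels, per_bin and before are made up here, and it assumes hour is an integer from 0 to 23. Taking cumsum() of the per-bin counts gives the running totals the question asks for (everything before 3am, everything before 6am, and so on).

```r
# Hypothetical sample data: hours 0-23 only, purely for illustration
dat <- data.frame(hour = c(0, 2, 3, 5, 6, 8, 9, 17, 23))

# Breaks every 3 hours; right = FALSE makes the intervals [0,3), [3,6), ...
breaks <- seq(0, 24, by = 3)
labels <- paste0(head(breaks, -1), "-", tail(breaks, -1))  # "0-3", "3-6", ...
dat$bin <- cut(dat$hour, breaks = breaks, labels = labels, right = FALSE)

# Observations per 3-hour window (empty windows are kept with a count of 0)
per_bin <- table(dat$bin)

# Running totals: observations before 3am, before 6am, before 9am, ...
before <- cumsum(as.vector(per_bin))
names(before) <- paste("Before", tail(breaks, -1))
before
```

Changing the by = 3 step in seq() changes the window width without touching the rest of the code.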
How about length(which(data$hour <= 2))? I used hour <= 2 here, which for integer hours is the same as "before 3am", to avoid having to deal with minutes and seconds in the first place. Then loop or apply over all the different hours you want to count.
If you need to restart your count every day, then make use of the data$day value similarly.
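The "loop or apply" step could look roughly like this. It is a sketch on made-up data: the sample values and the names cutoffs, counts and by_day are assumptions for illustration, and sum() on a logical vector is used as a shorter equivalent of length(which(...)).

```r
# Hypothetical sample data in the same shape as the question's
data <- data.frame(day  = c(1, 1, 1, 2, 2, 2),
                   hour = c(1, 4, 17, 2, 2, 8))

cutoffs <- seq(3, 24, by = 3)  # 3am, 6am, ..., midnight

# For each cutoff, count the rows whose hour is strictly below it.
# sum() on a logical vector counts the TRUEs.
counts <- sapply(cutoffs, function(h) sum(data$hour < h))
names(counts) <- paste("Before", cutoffs)

# Same counts restarted for each day: one row per day, one column per cutoff
by_day <- t(sapply(split(data$hour, data$day),
                   function(hrs) sapply(cutoffs, function(h) sum(hrs < h))))
colnames(by_day) <- paste("Before", cutoffs)

counts
by_day
```

Because the comparison uses whole hours, minute and second never enter into it; a cutoff that is not on the hour would need the three columns combined first.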