
How do I count the number of observations at given intervals in R?

Tags: r

I have data which includes variables for the hour, the minute, and the second for each observation. I want to count the number of observations before 3am, all observations before 6am, all observations before 9am and so on. Any help on this would be hugely appreciated.

Example of the data:

day    hour    minute   second
01       17        10       03
01       17        14       20
01       17        25       27
01       17        32       39
01       17        33       40
01       17        34       10
01       17        34       14
01       17        34       16
01       17        34       21
01       17        34       23
01       17        34       25
01       17        34       31
01       17        34       36

I have about 300,000 observations like this.

hour : int 17 17 17 17 17 17 17 17 17 17

minute: int 10 14 25 32 33 34 34 34 34 34

second: int 3 20 27 39 40 10 14 16 21 23

HFC asked Feb 23 '12 18:02





2 Answers

One approach is to create a new variable based on your binning criteria, then tabulate on that variable:

set.seed(1)
dat <- data.frame(hour = sample(0:23, 100, TRUE, prob = runif(24)),
                  minute = sample(0:59,100, TRUE, prob = runif(60)),
                  second = sample(0:59,100, TRUE, prob = runif(60)))

#Adjust bins accordingly
dat <- transform(dat, bin = ifelse(hour < 3,"Before 3",
                                   ifelse(hour < 6,"Before 6",
                                          ifelse(hour < 9, "Before 9", "Later in day"))))

as.data.frame(table(dat$bin))
          Var1 Freq
1     Before 3    7
2     Before 6   17
3     Before 9   19
4 Later in day   57

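Nested ifelse() calls become hard to read and maintain as the number of bins grows, but this should give you a start. Note that these bins are mutually exclusive: "Before 6" counts only hours 3 to 5, so take a running total of the counts if you want every observation before each cutoff. Update your question with more details if you get stuck.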
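If the nested ifelse() calls get out of hand, one possible alternative is cut() for the binning and cumsum() for the running totals the question asks about. This is only a sketch: the sample data and the 3-hour break points are assumptions made for illustration, not taken from the asker's data.

```r
# Hypothetical sample data with the same `hour` column as in the question
dat <- data.frame(hour = c(0, 2, 3, 5, 8, 9, 17, 23))

# Bin edges every 3 hours; right = FALSE gives intervals [0,3), [3,6), ...
breaks <- seq(0, 24, by = 3)
dat$bin <- cut(dat$hour, breaks = breaks, right = FALSE)

# Count per bin, then accumulate to get "before 3", "before 6", ...
per_bin <- table(dat$bin)
cumulative <- cumsum(per_bin)
names(cumulative) <- paste("Before", breaks[-1])
cumulative
```

Changing the `by` argument of seq() changes the interval width without adding any more nesting.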

Chase answered Sep 27 '22 21:09



How about length(which(data$hour <= 2))? I used hour <= 2 rather than a 3 o'clock cutoff so that minutes and seconds don't need to be handled at all. Then loop or apply over each of the cutoff hours you want to count.
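A minimal sketch of the "apply over the cutoff hours" idea, assuming whole-hour cutoffs every 3 hours and a small made-up data frame. Because hour is an integer, hour < 3 selects the same rows as hour <= 2.

```r
# Hypothetical sample data; `hour` is an integer from 0 to 23 as in the question
dat <- data.frame(hour = c(0, 2, 3, 5, 8, 9, 17, 23))

# Cutoff hours to count up to (assumed: every 3 hours)
cutoffs <- seq(3, 24, by = 3)

# sum() of a logical vector counts the TRUEs, same as length(which(...))
counts <- sapply(cutoffs, function(h) sum(dat$hour < h))
names(counts) <- paste("Before", cutoffs)
counts
```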

If you need to restart your count every day, then make use of the data$day value similarly.
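One way the per-day restart could look, again as a sketch with made-up data: split the hours by day and apply the same cutoff counts within each day. The column names match the question; the values and cutoffs are assumptions.

```r
# Hypothetical sample data covering two days
dat <- data.frame(day  = c(1, 1, 1, 2, 2),
                  hour = c(1, 4, 17, 2, 7))

cutoffs <- seq(3, 24, by = 3)

# For each day's hours, count the observations before each cutoff;
# t() puts days in rows and cutoffs in columns
per_day <- t(sapply(split(dat$hour, dat$day),
                    function(h) sapply(cutoffs, function(k) sum(h < k))))
colnames(per_day) <- paste("Before", cutoffs)
per_day
```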

Carl Witthoft answered Sep 27 '22 20:09
