R: How to filter/subset a sequence of dates

I have this data: (complete for December)

      date     sessions
1   2014-12-01  1932
2   2014-12-02  1828
3   2014-12-03  2349
4   2014-12-04  8192
5   2014-12-05  3188
6   2014-12-06  3277

And I need to subset/filter this, for example from "2014-12-05" to "2014-12-25".

I know that you can create a sequence with the operator ":".

Example: b <- c(1:5)

But how do I filter by a sequence of dates? I tried this:

NewDate <- filter(Dates, date("2014-12-05":"2014-12-12"))

But it says:

Error: unexpected symbol in: "NewDate <- filter(Dates, date("2014-12-05":"2014-12-12") NewDate"

Omar Gonzales asked Feb 05 '15 03:02


3 Answers

You could use subset().

Generating your sample data:

temp <- read.table(text="date     sessions
2014-12-01  1932
2014-12-02  1828
2014-12-03  2349
2014-12-04  8192
2014-12-05  3188
2014-12-06  3277", header=TRUE)

Making sure it's in date format:

temp$date <- as.Date(temp$date, format= "%Y-%m-%d")

temp



 #        date sessions
 # 1 2014-12-01     1932
 # 2 2014-12-02     1828
 # 3 2014-12-03     2349
 # 4 2014-12-04     8192
 # 5 2014-12-05     3188
 # 6 2014-12-06     3277

Using subset :

subset(temp, date> "2014-12-03" & date < "2014-12-05")

which gives:

  #        date sessions
  # 4 2014-12-04     8192

You could also use [] indexing:

temp[(temp$date> "2014-12-03" & temp$date < "2014-12-05"),]
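
The sequence idea from the question can also be made to work in base R. The sketch below rebuilds the sample data so it runs on its own; `wanted` and `in_range` are illustrative names, not part of the original answer. It relies on `seq()` having a method for `Date` objects, whereas `:` does not work on date strings.

```r
# Rebuild the sample data so this block is self-contained
temp <- read.table(text = "date sessions
2014-12-01 1932
2014-12-02 1828
2014-12-03 2349
2014-12-04 8192
2014-12-05 3188
2014-12-06 3277", header = TRUE)
temp$date <- as.Date(temp$date, format = "%Y-%m-%d")

# seq() builds a daily sequence of Date values (both endpoints included)
wanted <- seq(as.Date("2014-12-05"), as.Date("2014-12-25"), by = "day")

# Keep only the rows whose date appears in that sequence
in_range <- temp[temp$date %in% wanted, ]
in_range
```

With only six rows of sample data, this keeps the rows for 2014-12-05 and 2014-12-06. For a contiguous range the comparison form above is probably simpler; `%in%` is more useful when the wanted dates are not contiguous.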

jalapic answered Oct 18 '22 00:10


If you want to use dplyr, you can try something like this.

mydf <- structure(list(date = structure(c(16405, 16406, 16407, 16408, 
16409, 16410), class = "Date"), sessions = c(1932L, 1828L, 2349L, 
8192L, 3188L, 3277L)), .Names = c("date", "sessions"), row.names = c("1", 
"2", "3", "4", "5", "6"), class = "data.frame")

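library(dplyr)

# Make sure the column is of class Date (it already is in this sample data)
mydf$date <- as.Date(mydf$date)

# Keep rows with dates from 2014-12-02 to 2014-12-05, both ends included
filter(mydf, between(date, as.Date("2014-12-02"), as.Date("2014-12-05")))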

# Without between(), plain comparisons also work; these two calls are equivalent.

filter(mydf, date >= "2014-12-02", date <= "2014-12-05")
filter(mydf, date >= "2014-12-02" & date <= "2014-12-05")

#        date sessions
#1 2014-12-02     1828
#2 2014-12-03     2349
#3 2014-12-04     8192
#4 2014-12-05     3188

jazzurro answered Oct 18 '22 02:10


An option using data.table

 library(data.table)
 setDT(df)[date %between% c('2014-12-02', '2014-12-05')]
 #         date sessions
 #1: 2014-12-02     1828
 #2: 2014-12-03     2349
 #3: 2014-12-04     8192
 #4: 2014-12-05     3188

This should also work if the "date" column is of class "character":

 df$date <- as.character(df$date)
 setDT(df)[date %between% c('2014-12-02', '2014-12-05')]
 #       date sessions
 #1: 2014-12-02     1828
 #2: 2014-12-03     2349
 #3: 2014-12-04     8192
 #4: 2014-12-05     3188
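
One plausible reason the character version selects the same rows is that ISO 8601 strings (`YYYY-MM-DD`) sort in the same order as the dates they represent. Here is a minimal base R sketch of that idea, with illustrative names (`dates_chr`, `keep`, `keep_date`). It does not use data.table, and it would not carry over to formats such as `DD/MM/YYYY`.

```r
dates_chr <- c("2014-12-01", "2014-12-02", "2014-12-03",
               "2014-12-04", "2014-12-05", "2014-12-06")

# Plain string comparison, with no conversion to Date
keep <- dates_chr >= "2014-12-02" & dates_chr <= "2014-12-05"
dates_chr[keep]

# The same selection after converting everything to Date
keep_date <- as.Date(dates_chr) >= as.Date("2014-12-02") &
  as.Date(dates_chr) <= as.Date("2014-12-05")

# Both approaches flag the same elements
identical(keep, keep_date)
```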

If we want to exclude the bounds of the range:

  setDT(df)[between(date, '2014-12-02', '2014-12-05', incbounds=FALSE)]
  #         date sessions
  #1:  2014-12-03     2349
  #2:  2014-12-04     8192

data

 df <-  structure(list(date = structure(c(16405, 16406, 16407, 16408, 
 16409, 16410), class = "Date"), sessions = c(1932L, 1828L, 2349L, 
 8192L, 3188L, 3277L)), .Names = c("date", "sessions"), row.names = c("1", 
 "2", "3", "4", "5", "6"), class = "data.frame")

akrun answered Oct 18 '22 02:10