
Format dates to Month-Year while keeping class Date

Tags: date, r

I feel like there's a pretty simple way to do this, but I'm not finding it easily...

I am working with R to extract data from a dataset and then summarize it by a number of different characteristics. One of them is the month in which an event is scheduled or has occurred. We have the exact date of the event in the database, something like this:

person_id    date_visit
1            2012-05-03
2            2012-08-13
3            2012-12-12
...

I would like to use the table() function to generate a summary table that would look something like this:

Month    Freq
Jan 12   1
Feb 12   2
Mar 12   1
Apr 12   3
...

My issue is this. I've read the data in and used as.Date() to convert character strings to dates. I can use format.Date() to get the dates formatted as Jan 12, Mar 12, etc. But when you use format.Date(), you end up with character strings again. This means when you apply table() to them, they come out in alphabetical order (my current set is Aug 12, Jul 12, Jun 12, Mar 12, and so forth).
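A minimal reproduction of the ordering problem described above, using made-up dates rather than the real dataset (the vector names `dates` and `labels` are placeholders for illustration):

```r
# Hypothetical example dates, not taken from the real dataset
dates <- as.Date(c("2012-03-05", "2012-06-10", "2012-07-21", "2012-08-13"))

# format() returns a character vector, so the Date class is lost
labels <- format(dates, "%b %y")
class(labels)

# table() converts the strings to a factor whose levels are sorted
# alphabetically, so the months no longer appear in calendar order
tbl <- table(labels)
names(tbl)
```

The month abbreviations produced by `%b` depend on the current locale, so the exact labels may differ on other systems.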

I know that in SAS, you could use a format to change the appearance of a date while preserving it as a date (so you could still do date operations on it). Can the same thing be done using R?

My plan is to build a nice data frame through a number of steps, and then (after making sure that all the dates are converted to strings, for compatibility reasons) use xtable() to make a nice LaTeX output.

Here's my code at present.

load("temp.RData")
ds$date_visit <- as.Date(ds$date_visit,format="%Y-%m-%d")
table(format.Date(ds$date_visit,format="%b %Y"))

ETA: I'd prefer to just do it in Base R if I can, but if I have to I can always use an additional package.

TARehman asked Jun 22 '12 16:06



2 Answers

You could use the yearmon class from the zoo package:

require("zoo")
ds <- data.frame(person_id=1:3, date_visit=c("2012-05-03", "2012-08-13", "2012-12-12"))
ds$date_visit <- as.yearmon(ds$date_visit)
ds
  person_id date_visit
1         1   May 2012
2         2   Aug 2012
3         3   Dec 2012
GSee answered Oct 13 '22 09:10



month.abb is a constant vector in R. Matching the first three letters of the table's names against it gives the calendar order of the months, which can then be used to sort the table.

ds <- data.frame(person_id=1:3, date_visit=as.Date(c("2012-05-03", "2012-08-13", "2012-12-12")))
# Tabulate the formatted dates; the names come out in alphabetical order
tbl <- table(format(ds$date_visit, format="%b %Y"))
# Reorder by the position of each month abbreviation in month.abb
tbl[order(match(substr(names(tbl), 1, 3), month.abb))]

May 2012 Aug 2012 Dec 2012 
       1        1        1 

With more than one year of data, that ordering would put all the "May"s together regardless of year, so sort on the year first and the month second:

tbl[order(substr(names(tbl), 5, 8), match(substr(names(tbl), 1, 3), month.abb))]
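As a possible base R alternative that avoids parsing the table names, the labels can be turned into a factor whose levels are already in chronological order. This is only a sketch: the fourth date and the helper names `month_start`, `ordered_months`, and `month_factor` are made up for illustration.

```r
# Example data; the 2013 row is an added assumption to show multi-year ordering
ds <- data.frame(
  person_id  = 1:4,
  date_visit = as.Date(c("2012-05-03", "2012-08-13", "2012-12-12", "2013-05-20"))
)

# First day of each visit's month; still class Date, so it sorts chronologically
month_start <- as.Date(format(ds$date_visit, "%Y-%m-01"))

# Unique months in calendar order, used to define the factor levels
ordered_months <- sort(unique(month_start))

# Month-year labels as a factor with chronologically ordered levels
month_factor <- factor(format(month_start, "%b %Y"),
                       levels = format(ordered_months, "%b %Y"))

# table() follows the factor's level order instead of alphabetical order
tbl <- table(month_factor)
tbl
```

Because the ordering comes from Date values rather than from the label text, it does not rely on month.abb matching the locale's month abbreviations.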
IRTFM answered Oct 13 '22 10:10
