
Extract date elements from POSIXlt and put into data frame in R

Tags:

r

My second question of the day and my last attempt to use R to clean up this data. Here's the sitrep:

I have a data frame that has a column which is a POSIXlt date type. I want to extract the day, month and year from that column and create 3 new columns called (cleverly) day, month and year.

The data frame looks like this:

order_id      dd_mmm_yy
   1          2005-07-28
   2          2007-03-04

I want to end up with this:

order_id      dd_mmm_yy    day   month   year
   1          2005-07-28    28     7     2005
   2          2007-03-04    4      3     2007

I've created a function to extract the day, month and year and return them in a list (or data frame, I've tried both).

extractdate = function (date) {
    day = format(date, format="%d")
    month = format(date, format="%m")
    year = format(date, format="%Y")

    list(day=day, month=month, year=year)
}

Here's what I've tried based on an earlier problem and question:

cbind(orders, t(sapply(orders$dd_mmm_yy, extractdate)))

which gives me this:

Error in data.frame(..., check.names = FALSE) : 
arguments imply differing number of rows: 5, 9

The t(sapply... by itself gives me this for some crazy reason:

      day         month       year       
sec   Character,5 Character,5 Character,5
min   Character,5 Character,5 Character,5
hour  Character,5 Character,5 Character,5
mday  Character,5 Character,5 Character,5
mon   Character,5 Character,5 Character,5
year  Character,5 Character,5 Character,5
wday  Character,5 Character,5 Character,5
yday  Character,5 Character,5 Character,5
isdst Character,5 Character,5 Character,5

What on earth is going on? Am I better off using something like Python or Java to do all the data manipulation I need to do on this data before bringing it into R for analysis?

Dave Kincaid asked Nov 14 '11 18:11


2 Answers

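POSIXlt objects are stored as a list of components (see the Details section of ?POSIXlt for more information). sapply loops over those components (the nine sec, min, hour, ... row names in your output) rather than over the individual dates, which is why the result has 9 rows and cbind complains about 5 versus 9. Because the dd_mmm_yy column is POSIXlt, you don't need a function to extract the day, month and year. You can just pull out the components by name: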

orders$day <- orders$dd_mmm_yy$mday        # day of month
orders$month <- orders$dd_mmm_yy$mon+1     # month of year (zero-indexed)
orders$year <- orders$dd_mmm_yy$year+1900  # years since 1900
orders
#   order_id  dd_mmm_yy day month year
# 1        1 2005-07-28  28     7 2005
# 2        2 2007-03-04   4     3 2007
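
If you would rather keep the extractdate function from the question, it should also work when called once on the whole column instead of through sapply, because format() is vectorised over POSIXlt. The following is only a minimal sketch: the two example rows are rebuilt by hand from the question, and the tz = "UTC" argument and the as.integer() conversion are assumptions added for illustration, not part of the original code.

```r
# Rebuild the example data from the question. The construction and
# tz = "UTC" are assumptions; only the values come from the question.
orders <- data.frame(order_id = 1:2)
orders$dd_mmm_yy <- as.POSIXlt(c("2005-07-28", "2007-03-04"), tz = "UTC")

# The question's function, unchanged apart from whitespace and `<-`.
extractdate <- function(date) {
  day   <- format(date, format = "%d")
  month <- format(date, format = "%m")
  year  <- format(date, format = "%Y")
  list(day = day, month = month, year = year)
}

# Call it once on the whole column rather than via sapply(), so that
# format() sees the dates themselves and not the POSIXlt components.
parts <- extractdate(orders$dd_mmm_yy)

# format() returns character strings such as "07", so convert each
# piece to integer before adding it as a new column.
orders$day   <- as.integer(parts$day)
orders$month <- as.integer(parts$month)
orders$year  <- as.integer(parts$year)
```

Compared with reading $mday, $mon and $year directly, this route goes through strings and back, so the direct component access is likely the simpler choice when the column is already POSIXlt.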
Joshua Ulrich answered Sep 23 '22 03:09


One-liner using lubridate (mutate comes from plyr):

library(plyr); library(lubridate)
# orders is the data frame from the question
mutate(orders, date = ymd(dd_mmm_yy), day = day(date),
  month = month(date), year = year(date))

  order_id  dd_mmm_yy       date day month year
1        1 2005-07-28 2005-07-28  28     7 2005
2        2 2007-03-04 2007-03-04   4     3 2007
Ramnath answered Sep 23 '22 03:09