 

Split date data (m/d/y) into 3 separate columns

Tags: date, r

I need to convert dates (m/d/y format) into 3 separate columns on which I hope to run an algorithm (I'm trying to convert my dates into Julian Day Numbers). I saw this suggestion, made to another user, for separating data out into multiple columns using Oracle. I'm using R and am thoroughly stuck on how to code this appropriately. Would A1, A2, ... represent my new column headings, and how would the format of the "update set" section differ?

 update <tablename> set A1 = substr(ORIG, 1, 4), 
                       A2 = substr(ORIG, 5, 6), 
                       A3 = substr(ORIG, 11, 6), 
                       A4 = substr(ORIG, 17, 5); 

I'm trying hard to improve my skills in R but cannot figure this one out. Any help is much appreciated. Thanks in advance... :)

Joey asked Nov 02 '10 14:11


3 Answers

I use the format() method for Date objects to pull apart dates in R. Using Dirk's datetxt, here is how I would go about breaking up a date into its constituent parts:

datetxt <- c("2010-01-02", "2010-02-03", "2010-09-10")
datetxt <- as.Date(datetxt)
df <- data.frame(date = datetxt,
                 year = as.numeric(format(datetxt, format = "%Y")),
                 month = as.numeric(format(datetxt, format = "%m")),
                 day = as.numeric(format(datetxt, format = "%d")))

Which gives:

> df
        date year month day
1 2010-01-02 2010     1   2
2 2010-02-03 2010     2   3
3 2010-09-10 2010     9  10

Note what several others have said; you can get the Julian dates without splitting out the various date components. I added this answer to show how you could do the breaking apart if you needed it for something else.
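
For dates stored as m/d/y text, as in the question, the same format() approach should work once the strings are parsed with an explicit format. A minimal sketch, assuming a hypothetical data frame dat with a character column ORIG (the column name is borrowed from the Oracle snippet, and the values are made up for illustration):

```r
# Hypothetical input: m/d/Y dates stored as text
dat <- data.frame(ORIG = c("1/2/2010", "2/3/2010", "9/10/2010"),
                  stringsAsFactors = FALSE)

# Parse once with an explicit format, then pull out each component
d <- as.Date(dat$ORIG, format = "%m/%d/%Y")
dat$month <- as.numeric(format(d, "%m"))
dat$day   <- as.numeric(format(d, "%d"))
dat$year  <- as.numeric(format(d, "%Y"))
dat
```

Parsing first and formatting afterwards avoids depending on character positions, which vary when months and days are not zero-padded.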

Gavin Simpson answered Oct 11 '22 16:10


Given a text variable x, like this:

> x
[1] "10/3/2001"

then:

> as.Date(x,"%m/%d/%Y")
[1] "2001-10-03"

converts it to a date object. Then, if you need it:

> julian(as.Date(x,"%m/%d/%Y"))
[1] 11598
attr(,"origin")
[1] "1970-01-01"

gives you a Julian date, i.e. the number of days since the origin (1970-01-01 by default).

Don't try the substring thing...

See help(as.Date) for more.
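
If what you need is the astronomical Julian Day Number rather than days since 1970-01-01, one option is to add a constant offset to the day count. A sketch, assuming the usual convention that 1970-01-01 corresponds to Julian Day Number 2440588; the variable names are only illustrative:

```r
x <- "10/3/2001"
d <- as.Date(x, "%m/%d/%Y")

# Days since 1970-01-01; as.numeric() drops the "origin" attribute
days_since_epoch <- as.numeric(julian(d))

# Assumed offset: Julian Day Number of 1970-01-01
jdn_epoch <- 2440588
jdn <- days_since_epoch + jdn_epoch
jdn
```

Check the offset against whatever definition your algorithm expects, since some sources count Julian days from midnight rather than noon.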

Spacedman answered Oct 11 '22 15:10


Quick ones:

  1. Julian date converters already exist in base R, see e.g. help(julian).

  2. One approach may be to parse the date as a POSIXlt and to then read off the components. Other date / time classes and packages will work too but there is something to be said for base R.

  3. Parsing dates as strings is almost always a bad approach.

Here is an example:

datetxt <- c("2010-01-02", "2010-02-03", "2010-09-10")
dates <- as.Date(datetxt) ## you could examine these as well
plt <- as.POSIXlt(dates)  ## now as POSIXlt types
plt$year + 1900           ## years are with offset 1900
#[1] 2010 2010 2010
plt$mon + 1               ## and months are on the 0 .. 11 interval
#[1] 1 2 9
plt$mday
#[1]  2  3 10
df <- data.frame(year=plt$year + 1900,
                  month=plt$mon + 1, day=plt$mday)
df
#  year month day
#1 2010     1   2
#2 2010     2   3
#3 2010     9  10

And of course

julian(dates)
#[1] 14611 14643 14862
#attr(,"origin")
#[1] "1970-01-01"
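
Some datasets use "Julian day" to mean the day of the year (1 to 366) rather than a running day count. If that is what is needed, the POSIXlt representation carries it as well. A small sketch on the same example dates, where doy is just an illustrative name:

```r
datetxt <- c("2010-01-02", "2010-02-03", "2010-09-10")
plt <- as.POSIXlt(as.Date(datetxt))

# yday counts from 0, so add 1 to get the day of the year
doy <- plt$yday + 1
doy
```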

Dirk Eddelbuettel answered Oct 11 '22 15:10