
Convert dd/mm/yy and dd/mm/yyyy to Dates

Tags: regex, r, lubridate

I have a character vector with dates in various formats, like this:

dates <- c("23/11/12", "20/10/2012", "22/10/2012" ,"23/11/12")

I want to convert these to Dates. I have tried the very good dmy from the lubridate package, but this does not work:

    dmy(dates)
[1] "0012-11-23 UTC" "2012-10-20 UTC" "2012-10-22 UTC" "0012-11-23 UTC"

It is treating the /12 year as if it is 0012.

So I am now trying regular expressions to select each type and convert it individually to dates using as.Date(). However, the regular expression I tried for selecting only the dd/mm/yy strings does not work.

dates[grep('[0-9]{2}/[0-9]{2}/[0-9]{2,2}', dates)]

returns

[1] "23/11/12"   "20/10/2012" "22/10/2012" "23/11/12"

I thought that {2,2} should match exactly 2 digits and therefore not select all of them. I'm not very good at regular expressions, so any help will be appreciated.
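For reference, a minimal base R sketch of why every string is selected: grep looks for the pattern anywhere in the string, so the unanchored pattern is satisfied by the "20/10/20" prefix of "20/10/2012". Anchoring with ^ and $ forces the whole string to be dd/mm/yy. The variable names below are only illustrative.

```r
dates <- c("23/11/12", "20/10/2012", "22/10/2012", "23/11/12")

# Unanchored: the pattern may match any substring, including "20/10/20" in "20/10/2012"
unanchored <- grepl("[0-9]{2}/[0-9]{2}/[0-9]{2}", dates)

# Anchored: ^ and $ require the entire string to have the dd/mm/yy shape
two_digit_year <- grepl("^[0-9]{2}/[0-9]{2}/[0-9]{2}$", dates)

# Keep only the strings with a two-digit year
dates[two_digit_year]
```

With the anchors in place, {2} and {2,2} behave the same way; the quantifier was never the problem.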

Thanks

EDIT

What I actually have are three different types of date as below

dates <- c("23-Jul-2013", "23/11/12", "20/10/2012", "22/10/2012" ,"23/11/12")

And I want to convert these to dates

parse_date_time(dates,c('dmy'))

gives me

[1] "2013-07-23" "0012-11-23" "2012-10-20" "2012-10-22" "0012-11-23"

However, this is wrong: 0012 should be 2012. I would like a (fairly simple) solution to this.

One solution I now have (thanks to @plannapus) is to use regular expressions. I ended up creating this function, as I was still getting some cases where the lubridate approach was turning 12 into 0012:

    asDateRegex <- function(dates, 
        #selects strings from the vector dates using regexes and converts these to Dates
        regexes = c('[0-9]{2}/[0-9]{2}/[0-9]{4}', #dd/mm/yyyy
            '[0-9]{2}/[0-9]{2}/[0-9]{2}$', #dd/mm/yy
            '[0-9]{2}-[[:alpha:]]{3}-[0-9]{4}'), #dd-mon-yyyy
        orders = 'dmy',
        ...){
        require(lubridate)
        new_dates <- as.Date(rep(NA, length(dates)))
        for(reg in regexes){
            matched <- grep(reg, dates)  # indices of the strings in this format
            new_dates[matched] <- as.Date(parse_date_time(dates[matched], orders = orders))
        }
        new_dates
    }

asDateRegex (dates)
[1] "2012-10-20" "2013-07-23" "2012-11-23" "2012-10-22" "2012-11-23"

But this is not very elegant. Any better solutions?

Tom Liptrot asked Oct 17 '13 11:10



1 Answer

You can use parse_date_time from lubridate:

some.dates <- c("23/11/12", "20/10/2012", "22/10/2012" ,"23/11/12")
parse_date_time(some.dates,c('dmy'))
[1] "2012-11-23 UTC" "2012-10-20 UTC" "2012-10-22 UTC" "2012-11-23 UTC"

But note that the order of the formats is important:

some.dates <- c("20/10/2012","23/11/12",  "22/10/2012" ,"23/11/12")
parse_date_time(some.dates,c('dmY','dmy'))

[1] "2012-10-20 UTC" "2012-11-23 UTC" "2012-10-22 UTC" "2012-11-23 UTC"

EDIT

Internally parse_date_time is using guess_formats (which I guess uses some regular expressions):

guess_formats(some.dates,c('dmy'))
       dmy        dmy        dmy        dmy 
"%d/%m/%Y" "%d/%m/%y" "%d/%m/%Y" "%d/%m/%y" 
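The difference between the two guessed formats can be illustrated in base R, without lubridate. This is only a sketch of the general strptime behaviour, not of lubridate's internals: %Y reads the digits as the complete year, while %y reads them as a two-digit year (on input, 00 to 68 are taken as 20xx).

```r
# %Y takes the digits as the complete year; %y treats them as a two-digit year
full_year  <- as.Date("23/11/12", format = "%d/%m/%Y")
short_year <- as.Date("23/11/12", format = "%d/%m/%y")

# Extract the parsed year as a number to compare the two readings
year_full  <- as.integer(format(full_year, "%Y"))   # year 12
year_short <- as.integer(format(short_year, "%Y"))  # year 2012
```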

As mentioned in the comment, you can pass the guessed formats to as.Date like this:

as.Date(dates, format = guess_formats(dates,c('dmy')))
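This works because the format argument of as.Date can be a vector with one format per string. A minimal base R sketch of the same idea follows, swapping guess_formats for anchored regular expressions. The helper pick_format is hypothetical (it is not part of lubridate or base R), and %b assumes an English locale for the abbreviated month name.

```r
dates <- c("23-Jul-2013", "23/11/12", "20/10/2012", "22/10/2012", "23/11/12")

# Hypothetical helper: choose a strptime format for each string from its shape
pick_format <- function(x) {
  fmt <- rep(NA_character_, length(x))                           # NA where no pattern matches
  fmt[grepl("^[0-9]{2}/[0-9]{2}/[0-9]{4}$", x)] <- "%d/%m/%Y"    # dd/mm/yyyy
  fmt[grepl("^[0-9]{2}/[0-9]{2}/[0-9]{2}$", x)] <- "%d/%m/%y"    # dd/mm/yy
  fmt[grepl("^[0-9]{2}-[[:alpha:]]{3}-[0-9]{4}$", x)] <- "%d-%b-%Y"  # dd-mon-yyyy (locale dependent)
  fmt
}

formats <- pick_format(dates)

# One format per element, applied elementwise
parsed <- as.Date(dates, format = formats)
```

Strings that match none of the patterns get an NA format and therefore an NA date, which makes unexpected inputs easy to spot.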
agstudy answered Sep 30 '22 06:09