
R Lubridate Returns Unwanted Century When Given Two Digit Year

In R, I have a vector of strings representing dates in two different formats:

  1. "month/day/year"
  2. "month day, year"

The first format has a two-digit year, so my vector looks something like this:

c("3/18/75", "March 10, 1994", "10/1/80", "June 15, 1979",...)

I want to convert all the dates in the vector to one standard format. This should be easy with the mdy function from the lubridate package, but when I pass it a date in the first format, it returns the wrong century.

mdy("3/18/75") returns "2075-03-18 UTC"

Does anyone know how to make it return the date in the 20th century, that is, "1975-03-18 UTC"? Any other way to standardize the dates would be greatly appreciated as well.

I am running version lubridate_1.3.3 if that matters.

pseudorandom asked Oct 19 '15 18:10




2 Answers

You could do it like this:

library(lubridate)

some_dates <- c("3/18/75", "March 10, 1994", "10/1/80", "June 15, 1979")
dates <- mdy(some_dates)
# A year later than the current one must have been given the wrong century
future_dates <- year(dates) > year(Sys.Date())
year(dates[future_dates]) <- year(dates[future_dates]) - 100
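
The same roll-back idea can be wrapped in a small helper so the cutoff year is explicit rather than tied to the current date. This is only a sketch: `mdy_past` and its `latest_year` argument are hypothetical names, not part of lubridate, and it assumes no date in the data can fall after `latest_year`.

```r
library(lubridate)

# Hypothetical helper (not a lubridate function): parse month-day-year
# strings, then move any year later than `latest_year` back one century.
mdy_past <- function(x, latest_year = year(Sys.Date())) {
  dates <- mdy(x)
  # Ignore entries that failed to parse (NA) when picking dates to shift
  too_late <- !is.na(dates) & year(dates) > latest_year
  if (any(too_late)) {
    year(dates[too_late]) <- year(dates[too_late]) - 100
  }
  dates
}

mdy_past(c("3/18/75", "March 10, 1994", "10/1/80", "June 15, 1979"),
         latest_year = 2015)
```

Passing `latest_year` explicitly keeps the result reproducible, whereas the default depends on the day the code is run.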

Maybe a better approach would be to remove the ambiguity from your date strings though -- otherwise your code will be wrong when 2075 rolls around ;)

library(stringr)
some_dates <- c('3/18/75', '01/09/53')
str_replace(some_dates, '[0-9]+$', '19\\0')

Or, if the two date formats are mixed in one vector:

some_dates <- c("3/18/75", "March 10, 1994", "10/1/80", "June 15, 1979")
str_replace(some_dates, '/([0-9]{2}$)', '/19\\1')
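
To show the string fix and the parsing step end to end, here is a minimal sketch that uses base R's `sub()` in place of `str_replace()` and then hands the result to `mdy()`. It assumes every two-digit year in the data belongs to the 1900s; the variable names `expanded` and `parsed` are just for illustration.

```r
library(lubridate)

some_dates <- c("3/18/75", "March 10, 1994", "10/1/80", "June 15, 1979")

# Expand a trailing two-digit year ("/75") to four digits ("/1975").
# Assumption: all two-digit years are 19xx. Strings without a slash,
# such as "March 10, 1994", do not match and are left unchanged.
expanded <- sub("/([0-9]{2})$", "/19\\1", some_dates)

# With four-digit years throughout, mdy() has no century left to guess.
parsed <- mdy(expanded)
format(parsed)
```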
DunderChief answered Oct 16 '22 23:10


lubridate v1.7.4 does this too. I'm looking at a 2068 as we speak.
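
A 2068 is consistent with the two-digit-year rule that base R documents for the `%y` format in `strptime()`: values 00 to 68 are prefixed with 20, and 69 to 99 with 19. The sketch below demonstrates that rule with base R only; whether a given lubridate version applies the same cutoff is an assumption here, not something this code checks.

```r
# Base R only: %y maps 00-68 to 20xx and 69-99 to 19xx.
two_digit <- c("3/18/68", "3/18/69")
parsed <- as.Date(two_digit, format = "%m/%d/%y")
format(parsed)
```

Under that rule "68" lands in 2068 while "69" lands in 1969, so any two-digit year of 68 or below needs one of the fixes above if it is meant to be in the 1900s.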

Bruce answered Oct 16 '22 23:10