I have 20 years' worth of weather data, but I'm only interested in the within-year patterns. I don't care how June 1995 is different from June 2011, for example. Instead, I want to have 20 values for June 1, 20 values for June 2, etc.
My question: How do I drop the year portion of a date object and keep the month AND day, while also maintaining the sequential properties of dates? My ultimate goal is a long list of repeated mm/dd values, each corresponding to a value of the outcome variable. I'll treat the mm/dd values like factors, but in the correct order.
# Given this:
as.Date(c("2014-06-01","1993-06-01", "2013-06-03", "1999-01-31"), "%Y-%m-%d")
# I want to get this:
"06-01" "06-01" "06-03" "01-31"
# That will sort like this
"01-31" "06-01" "06-01" "06-03"
Little hacks like using sub() to drop the year and convert the dash to a decimal point don't work, because then the 1st of the month is the same as the 10th of the month. I also tried turning the dates into character strings, removing the year, and then turning them back into dates... that just made everything year 2014.
POSIXct stores both a date and a time with an associated time zone. The default time zone is the one your computer is set to, which is most often your local time zone. POSIXct stores the date and time as the number of seconds since 1 January 1970 (UTC).
You can use the as.Date() function to convert character data to dates. The format is as.Date(x, "format"), where x is the character data and format gives the appropriate format.
as.Date() is a built-in R function that converts between character representations and class “Date” objects representing calendar dates. Dates are represented as the number of days since 1970-01-01, with negative values for earlier dates.
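As a small illustration of the storage described above, here is a minimal sketch using only base R. The particular dates are just examples picked for this illustration.

```r
# Parse character data with an explicit format string.
d <- as.Date("1999-01-31", format = "%Y-%m-%d")

# Internally a Date is a day count relative to 1970-01-01.
as.numeric(d)                      # 10622
as.numeric(as.Date("1969-12-31"))  # -1, negative for earlier dates

# A POSIXct value is a count of seconds instead.
p <- as.POSIXct("1970-01-02 00:00:00", tz = "UTC")
as.numeric(p)                      # 86400
```

This is also why stripping the year and converting back to a Date cannot work: a Date always encodes a full calendar day, year included.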
Does this work?
temp<-as.Date(c("2014-06-01","1993-06-01", "2013-06-03", "1999-01-31"), "%Y-%m-%d")
x<-format(temp, format="%m-%d")
x
[1] "06-01" "06-01" "06-03" "01-31"
sort(x)
[1] "01-31" "06-01" "06-01" "06-03"
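If the goal is to pool the outcome by calendar day across years, the mm-dd strings can be used directly as a grouping key. A rough sketch, assuming a data frame with a date column and an outcome column; the names weather, temp, mmdd and daily_mean, and the temperature values, are made up for illustration:

```r
# Hypothetical data: one temperature reading per date (values are made up).
weather <- data.frame(
  date = as.Date(c("2014-06-01", "1993-06-01", "2013-06-03", "1999-01-31")),
  temp = c(21.5, 19.0, 23.2, -4.1)
)

# Drop the year; zero-padded "mm-dd" strings sort in calendar order.
weather$mmdd <- format(weather$date, "%m-%d")

# Mean of the outcome for each calendar day, pooled across years.
daily_mean <- tapply(weather$temp, weather$mmdd, mean)
daily_mean
```

Because the month and day are zero-padded, plain character sorting already puts "06-01" before "06-10", which avoids the decimal-point problem mentioned in the question.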
jalapic's answer, just before mine, transforms the date column into a character vector (format() returns the object passed to it as character, for pretty printing).
According to the OP, one reason for getting rid of the year, perhaps the key one, is to roll up by day and month regardless of year. To me, that suggests a time series is not the right data type for this column; you are better off with an ordered factor, which will preserve the "sequential properties of dates" the OP requires.
this is pretty much the
Granted, a factor does not understand dates or numbers, but it does understand unique values, so in this instance at least it should behave as the OP wants.
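> d = "2014-06-01"
> d = as.Date(d)
# split on the leading four-digit year ("19xx-" or "20xx-") and keep the "mm-dd" part
fnx = function(x) {
unlist(strsplit(as.character(x), '(19|20)[0-9]{2}-', fixed=FALSE))[2]
}
> fnx("2012-01-25")
[1] "01-25"
> dm1 = sapply(column_of_date_objs, fnx)
# factor(), not as.factor(), accepts ordered=TRUE; levels default to the sorted unique values
> new_col = factor(dm1, ordered=TRUE)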
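For comparison, here is a sketch of the same ordered-factor idea without the regex, combining format() from the earlier answer with factor(). This is one possible way to do it, not necessarily what either answerer had in mind; the names dates, mmdd and mmdd_f are made up for the example.

```r
dates <- as.Date(c("2014-06-01", "1993-06-01", "2013-06-03", "1999-01-31"))

# format() is vectorized, so no per-element function or regex is needed.
mmdd <- format(dates, "%m-%d")

# Levels are the unique mm-dd strings in sorted (calendar) order.
mmdd_f <- factor(mmdd, levels = sort(unique(mmdd)), ordered = TRUE)

levels(mmdd_f)          # "01-31" "06-01" "06-03"
mmdd_f[4] < mmdd_f[1]   # TRUE: "01-31" comes before "06-01"
sort(mmdd_f)
```

An ordered factor supports comparisons and sorting by level, so the calendar order is kept even though the values are no longer dates.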