Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Format Date to Year-Month in R

Tags:

r

lubridate

I would like to retain my current date column in year-month format as date. It currently gets converted to chr format. I have tried as_datetime but it coerces all values to NA. The format I am looking for is: "2017-01"

library(lubridate)
df<- data.frame(Date=c("2017-01-01","2017-01-02","2017-01-03","2017-01-04",
                       "2018-01-01","2018-01-02","2018-02-01","2018-03-02"),
            N=c(24,10,13,12,10,10,33,45))
df$Date <- as_datetime(df$Date)
df$Date <- ymd(df$Date)
df$Date <- strftime(df$Date,format="%Y-%m")

Thanks in advance!

like image 633
AM_123 Avatar asked May 14 '18 20:05

AM_123


3 Answers

lubridate only handle dates, and dates have days. However, as alistaire mentions, you can floor them by month of you want work monthly:

library(tidyverse)

df_month <-
  df %>%
  mutate(Date = floor_date(as_date(Date), "month"))

If you e.g. want to aggregate by month, just group_by() and summarize().

df_month %>%
  group_by(Date) %>%
  summarize(N = sum(N)) %>%
  ungroup()

#> # A tibble: 4 x 2
#>  Date           N
#>  <date>     <dbl>
#>1 2017-01-01    59
#>2 2018-01-01    20
#>3 2018-02-01    33
#>4 2018-03-01    45
like image 114
Mikael Poul Johannesson Avatar answered Oct 18 '22 21:10

Mikael Poul Johannesson


You can solve this with zoo::as.yearmon() function. Follows the solution:

library(tidyquant)
library(magrittr) 
library(dplyr)

df <- data.frame(Date=c("2017-01-01","2017-01-02","2017-01-03","2017-01-04",
                  "2018-01-01","2018-01-02","2018-02-01","2018-03-02"),
           N=c(24,10,13,12,10,10,33,45))
df %<>% mutate(Date = zoo::as.yearmon(Date))
like image 6
Wlademir Ribeiro Prates Avatar answered Oct 18 '22 21:10

Wlademir Ribeiro Prates


You can use cut function, and use breaks="month" to transform all your days in your dates to the first day of the month. So any date within the same month will have the same date in the new created column.

This is usefull to group all other variables in your data frame by month (essentially what you are trying to do). However cut will create a factor, but this can be converted back to a date. So you can still have the date class in your data frame.

You just can't get rid of the day in a date (because then, is not a date...). Afterwards you can create a nice format for axes or tables. For example:

true_date <-
  as.POSIXlt(
    c(
      "2017-01-01",
      "2017-01-02",
      "2017-01-03",
      "2017-01-04",
      "2018-01-01",
      "2018-01-02",
      "2018-02-01",
      "2018-03-02"
    ),
    format = "%F"
  )

df <-
  data.frame(
    Date = cut(true_date, breaks = "month"),
    N = c(24, 10, 13, 12, 10, 10, 33, 45)
  )

## here df$Date is a 'factor'. You could use substr to create a formated column
df$formated_date <- substr(df$Date, start = 1, stop = 7)

## and you can convert back to date class. format = "%F", is ISO 8601 standard date format

df$true_date <- strptime(x = as.character(df$Date), format = "%F")

str(df)
like image 2
David Martinez C. Avatar answered Oct 18 '22 22:10

David Martinez C.