date_breaks {scales} shifts date scale in ggplot

As the sample dataset below shows, the dates run from 2015-01-01 to 2015-08-01. However, when I use the date_breaks function, the month labels on the x-axis are shifted. Could you suggest a solution?

dfn <- read.table(header=T, text='
supp p_date length
OJ  2015-01-01  13.23
OJ  2015-03-01  22.70
OJ  2015-08-01  26.06
VC  2015-01-01   7.98
VC  2015-03-01  16.77
VC  2015-08-01  26.14
                  ')

dfn$p_date <- as.Date(dfn$p_date, "%Y-%m-%d")

library(ggplot2)
library(scales)
ggplot(dfn, aes(as.POSIXct(p_date), length, colour = factor(supp))) + 
    geom_line(size=1.3) +
    labs(colour="Lines :", x = "", y = "") +
    guides(colour = guide_legend(override.aes = list(size=5))) +  
    scale_x_datetime(breaks = date_breaks("1 months"),labels = date_format("%m/%y"))


Here is my sessionInfo

> sessionInfo()
R version 3.2.2 (2015-08-14)
Platform: i386-w64-mingw32/i386 (32-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

locale:
[1] LC_COLLATE=Greek_Greece.1253  LC_CTYPE=Greek_Greece.1253    LC_MONETARY=Greek_Greece.1253 LC_NUMERIC=C                  LC_TIME=Greek_Greece.1253    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] scales_0.3.0  ggplot2_1.0.1

loaded via a namespace (and not attached):
 [1] labeling_0.3     MASS_7.3-44      colorspace_1.2-6 magrittr_1.5     plyr_1.8.3       tools_3.2.2      gtable_0.1.2     reshape2_1.4.1  
 [9] Rcpp_0.12.1      stringi_0.5-5    grid_3.2.2       stringr_1.0.0    digest_0.6.8     proto_0.3-10     munsell_0.4.2  
George Dontas asked Dec 18 '22 22:12


1 Answer

I noticed similar (but not identical) issues with the date labels, which appear to come down to how date_format() handles time zones when it formats the breaks.

Your base plot

library(ggplot2)
library(scales)

p <- ggplot(dfn, aes(as.POSIXct(p_date), length, colour = factor(supp))) + 
          geom_point(size=3) +
          geom_line(size=1.3) +
          labs(colour="Lines :", x = "", y = "") +
          guides(colour = guide_legend(override.aes = list(size=5))) +
          theme(axis.text.x=element_text(size=15))

p + scale_x_datetime(breaks = date_breaks("1 month"),
                     labels = date_format("%m/%Y"))

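This produces a plot in which the label for March appears twice. The duplication appears to come from the time zone used inside the format function: some breaks are formatted as the last day of the previous month, so with "%m/%Y" both 2015-03-01 and 2015-03-31 are labelled 03/2015. The labels of the default monthly breaks show the shift: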

date_format()(date_breaks()(as.POSIXct(dfn$p_date[c(1,3)])))
#[1] "2015-01-01" "2015-02-01" "2015-03-01" "2015-03-31" "2015-04-30" "2015-05-31"
#[7] "2015-06-30" "2015-07-31" "2015-08-31"
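The shift from the first of the month to the last day of the previous month is what you would see if a break at local midnight were formatted in UTC while the local zone is ahead of UTC. Here is a minimal base R sketch of that effect. The zone "Europe/London" and the names local_label and utc_label are my assumptions for illustration, not something taken from the scales code.

```r
# Midnight on 1 April 2015 in a zone that is on summer time (UTC+1) by then.
# "Europe/London" is an assumption picked for illustration.
x <- as.POSIXct("2015-04-01 00:00:00", tz = "Europe/London")

# Formatting in the value's own time zone keeps the calendar date.
local_label <- format(x, "%Y-%m-%d")

# Formatting the same instant in UTC moves it back one hour, into 31 March.
utc_label <- format(x, "%Y-%m-%d", tz = "UTC")

print(c(local = local_label, utc = utc_label))
```

January to March stay unchanged in this zone because it coincides with UTC until the clocks change at the end of March, which would match the pattern in the output above.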

You can write your own format function that does not pass a time zone to format(), so the labels are not converted to a different zone.

my_format <- function(format = "%Y-%m-%d") {
    function(x) format(x, format)
}

And plot

p + scale_x_datetime(breaks = date_breaks("1 month"),
                     labels = my_format("%m/%Y"))


This format behaves more like I would expect:

my_format()(date_breaks()(as.POSIXct(dfn$p_date[c(1,3)])))
#[1] "2015-01-01" "2015-02-01" "2015-03-01" "2015-04-01" "2015-05-01" "2015-06-01"
#[7] "2015-07-01" "2015-08-01" "2015-09-01"

This is not really an explanation, as my understanding of the POSIX functions is minimal, but it does suggest a workaround.
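Another option that may be worth trying, since the data contain only dates and no times, is to map the Date column directly and use scale_x_date() instead of converting to POSIXct. This is a sketch under that assumption, not something I have checked against the package versions listed in this thread. The name p_date_scale is just a placeholder, and the line and legend sizes from the original plot are left out for brevity.

```r
library(ggplot2)
library(scales)

# Same sample data as in the question.
dfn <- read.table(header = TRUE, text = '
supp p_date length
OJ  2015-01-01  13.23
OJ  2015-03-01  22.70
OJ  2015-08-01  26.06
VC  2015-01-01   7.98
VC  2015-03-01  16.77
VC  2015-08-01  26.14
')
dfn$p_date <- as.Date(dfn$p_date, "%Y-%m-%d")

# Map the Date column as-is (no as.POSIXct) and use the Date scale,
# so no time-of-day or time zone conversion is involved in the labels.
p_date_scale <- ggplot(dfn, aes(p_date, length, colour = factor(supp))) +
    geom_line() +
    labs(colour = "Lines :", x = "", y = "") +
    scale_x_date(breaks = date_breaks("1 month"),
                 labels = date_format("%m/%y"))

print(p_date_scale)
```

Because a Date has no time-of-day component, there is no midnight instant that could fall on the previous day in another zone.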

> sessionInfo()
R version 3.2.2 (2015-08-14)
Platform: i686-pc-linux-gnu (32-bit)
Running under: Ubuntu 14.04.3 LTS

locale:
 [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C               LC_TIME=en_GB.UTF-8       
 [4] LC_COLLATE=en_GB.UTF-8     LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8   
 [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C                  LC_ADDRESS=C              
[10] LC_TELEPHONE=C             LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] datasets  utils     stats     graphics  grDevices grid      methods   base     

other attached packages:
[1] scales_0.3.0       data.table_1.9.6   Hmisc_3.17-0       Formula_1.2-1      survival_2.38-3   
[6] lattice_0.20-33    MASS_7.3-44        gridExtra_2.1.0    ggplot2_1.0.1.9003

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.1         digest_0.6.8        chron_2.3-47        plyr_1.8.3          gtable_0.1.2       
 [6] acepack_1.3-3.3     latticeExtra_0.6-26 rpart_4.1-10        labeling_0.3        proto_0.3-10       
[11] splines_3.2.2       RColorBrewer_1.1-2  tools_3.2.2         foreign_0.8-66      munsell_0.4.2      
[16] colorspace_1.2-6    cluster_2.0.3       nnet_7.3-11  
user20650 answered Dec 21 '22 12:12