
How to avoid the connection lines in geom_line or geom_path when there is no data?

Tags:

r

ggplot2

I'm plotting a time series of mean values with geom_path and adding a ribbon of min/max values with geom_ribbon. There are some gaps in the time series, but the plot keeps connecting the lines across them. In the attached plot, the last panel shows the gaps in the data; those periods have no x or y entries. Is there any way to control this?

This is my plot code for the top panels:

ggplot(stat_total, aes(color=gas)) + 
  geom_path(aes(x=date_mean, y=conc_mean, color=gas), size=1.2, na.rm = T) +          
  geom_ribbon(aes(x=date_mean, ymin=conc_min, ymax=conc_max, fill=gas), color="grey70", alpha=0.4, na.rm = T)+
  scale_x_datetime(date_breaks = "3 weeks" , date_labels = "%d-%b") + 
  xlab(NULL) +
  ylab('[ppb]') + 
  theme_bw() +
  facet_wrap(gas~.,scales = 'free_x',ncol = 1,nrow=2)

And a sample of the data:

structure(list(day = c(6L, 6L, 7L, 7L, 8L, 8L, 9L, 9L, 10L, 10L, 
11L, 11L, 12L, 12L, 13L, 13L, 15L, 15L, 16L, 16L, 17L, 17L, 18L, 
18L, 20L, 20L, 21L, 21L, 25L, 25L, 26L, 26L, 27L, 27L, 28L, 28L, 
1L, 1L, 2L, 2L, 3L, 3L, 4L, 4L, 5L, 5L, 12L, 12L, 13L, 13L, 14L, 
14L, 15L, 15L, 16L, 16L, 17L, 17L, 18L, 18L, 19L, 19L, 20L, 20L, 
22L, 22L, 23L, 23L, 24L, 24L, 25L, 25L, 26L, 26L, 27L, 27L, 28L, 
28L, 29L, 29L, 30L, 30L, 31L, 31L, 2L, 2L, 3L, 3L, 4L, 4L, 5L, 
5L, 6L, 6L, 7L, 7L, 8L, 8L, 9L, 9L, 11L, 11L, 12L, 12L, 13L, 
13L, 26L, 26L, 27L, 27L, 28L, 28L, 29L, 29L, 30L, 30L, 1L, 1L, 
2L, 2L, 3L, 3L, 4L, 4L, 5L, 5L, 14L, 14L, 15L, 15L, 16L, 16L, 
17L, 17L, 18L, 18L, 19L, 19L, 20L, 20L, 21L, 21L, 22L, 22L, 23L, 
23L, 24L, 24L, 25L, 25L, 26L, 26L, 27L, 27L, 28L, 28L, 29L, 29L, 
30L, 30L, 31L, 31L, 1L, 1L, 2L, 2L, 3L, 3L, 4L, 4L, 5L, 5L, 6L, 
6L, 7L, 7L, 8L, 8L, 9L, 9L, 15L, 15L, 16L, 16L, 17L, 17L, 18L, 
18L, 19L, 19L, 20L, 20L, 21L, 21L, 22L, 22L, 24L, 24L, 25L, 25L, 
26L, 26L), month = c(2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 
2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 
3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 
3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 
4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 
5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 
5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 
5, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 
6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6), 
    gas = c("AVOC", "BVOC", "AVOC", "BVOC", "AVOC", "BVOC", "AVOC", 
    "BVOC", "AVOC", "BVOC", "AVOC", "BVOC", "AVOC", "BVOC", "AVOC", 
    "BVOC", "AVOC", "BVOC", "AVOC", "BVOC", "AVOC", "BVOC", "AVOC", 
    "BVOC", "AVOC", "BVOC", "AVOC", "BVOC", "AVOC", "BVOC", "AVOC", 
    "BVOC", "AVOC", "BVOC", "AVOC", "BVOC", "AVOC", "BVOC", "AVOC", 
    "BVOC", "AVOC", "BVOC", "AVOC", "BVOC", "AVOC", "BVOC", "AVOC", 
    "BVOC", "AVOC", "BVOC", "AVOC", "BVOC", "AVOC", "BVOC", "AVOC", 
    "BVOC", "AVOC", "BVOC", "AVOC", "BVOC", "AVOC", "BVOC", "AVOC", 
    "BVOC", "AVOC", "BVOC", "AVOC", "BVOC", "AVOC", "BVOC", "AVOC", 
    "BVOC", "AVOC", "BVOC", "AVOC", "BVOC", "AVOC", "BVOC", "AVOC", 
    "BVOC", "AVOC", "BVOC", "AVOC", "BVOC", "AVOC", "BVOC", "AVOC", 
    "BVOC", "AVOC", "BVOC", "AVOC", "BVOC", "AVOC", "BVOC", "AVOC", 
    "BVOC", "AVOC", "BVOC", "AVOC", "BVOC", "AVOC", "BVOC", "AVOC", 
    "BVOC", "AVOC", "BVOC", "AVOC", "BVOC", "AVOC", "BVOC", "AVOC", 
    "BVOC", "AVOC", "BVOC", "AVOC", "BVOC", "AVOC", "BVOC", "AVOC", 
    "BVOC", "AVOC", "BVOC", "AVOC", "BVOC", "AVOC", "BVOC", "AVOC", 
    "BVOC", "AVOC", "BVOC", "AVOC", "BVOC", "AVOC", "BVOC", "AVOC", 
    "BVOC", "AVOC", "BVOC", "AVOC", "BVOC", "AVOC", "BVOC", "AVOC", 
    "BVOC", "AVOC", "BVOC", "AVOC", "BVOC", "AVOC", "BVOC", "AVOC", 
    "BVOC", "AVOC", "BVOC", "AVOC", "BVOC", "AVOC", "BVOC", "AVOC", 
    "BVOC", "AVOC", "BVOC", "AVOC", "BVOC", "AVOC", "BVOC", "AVOC", 
    "BVOC", "AVOC", "BVOC", "AVOC", "BVOC", "AVOC", "BVOC", "AVOC", 
    "BVOC", "AVOC", "BVOC", "AVOC", "BVOC", "AVOC", "BVOC", "AVOC", 
    "BVOC", "AVOC", "BVOC", "AVOC", "BVOC", "AVOC", "BVOC", "AVOC", 
    "BVOC", "AVOC", "BVOC", "AVOC", "BVOC", "AVOC", "BVOC", "AVOC", 
    "BVOC", "AVOC", "BVOC"), date_mean = structure(c(1549475100, 
    1549475100, 1549542360, 1549542360, 1549620787.5, 1549620787.5, 
    1549710663.15789, 1549710663.15789, 1549801042.10526, 1549801042.10526, 
    1549885680, 1549885680, 1549971100, 1549971100, 1550052300, 
    1550052300, 1550263680, 1550263680, 1550312871.42857, 1550312871.42857, 
    1550406436.36364, 1550406436.36364, 1550475600, 1550475600, 
    1550686320, 1550686320, 1550756700, 1550756700, 1551105981.81818, 
    1551105981.81818, 1551177428.57143, 1551177428.57143, 1551260700, 
    1551260700, 1551351176.47059, 1551351176.47059, 1551442263.15789, 
    1551442263.15789, 1551537771.42857, 1551537771.42857, 1551617052.63158, 
    1551617052.63158, 1551703500, 1551703500, 1551782925, 1551782925, 
    1552427550, 1552427550, 1552499742.85714, 1552499742.85714, 
    1552551075, 1552551075, 1552645800, 1552645800, 1552737120, 
    1552737120, 1552830942.85714, 1552830942.85714, 1552885875, 
    1552885875, 1553019075, 1553019075, 1553065457.14286, 1553065457.14286, 
    1553274000, 1553274000, 1553350725, 1553350725, 1553430857.14286, 
    1553430857.14286, 1553519076.92308, 1553519076.92308, 1553572800, 
    1553572800, 1553714100, 1553714100, 1553774717.64706, 1553774717.64706, 
    1553857842.85714, 1553857842.85714, 1553942057.14286, 1553942057.14286, 
    1553995800, 1553995800, 1554210800, 1554210800, 1554313000, 
    1554313000, 1554383442.85714, 1554383442.85714, 1554463080, 
    1554463080, 1554551672.72727, 1554551672.72727, 1554640740, 
    1554640740, 1554723600, 1554723600, 1554809760, 1554809760, 
    1555006320, 1555006320, 1555067250, 1555067250, 1555150950, 
    1555150950, 1556319600, 1556319600, 1556373600, 1556373600, 
    1556453400, 1556453400, 1556533800, 1556533800, 1556646300, 
    1556646300, 1556707628.57143, 1556707628.57143, 1556797800, 
    1556797800, 1556888123.07692, 1556888123.07692, 1556974800, 
    1556974800, 1557050072.72727, 1557050072.72727, 1557869400, 
    1557869400, 1557914563.63636, 1557914563.63636, 1558005726.31579, 
    1558005726.31579, 1558092937.5, 1558092937.5, 1558178600, 
    1558178600, 1558265611.76471, 1558265611.76471, 1558351376.47059, 
    1558351376.47059, 1558436400, 1558436400, 1558525050, 1558525050, 
    1558612164.70588, 1558612164.70588, 1558699300, 1558699300, 
    1558783320, 1558783320, 1558874400, 1558874400, 1558935600, 
    1558935600, 1559079900, 1559079900, 1559128950, 1559128950, 
    1559216747.36842, 1559216747.36842, 1559301900, 1559301900, 
    1559387300, 1559387300, 1559474258.82353, 1559474258.82353, 
    1559561717.64706, 1559561717.64706, 1559649494.11765, 1559649494.11765, 
    1559733300, 1559733300, 1559816485.71429, 1559816485.71429, 
    1559908270.58824, 1559908270.58824, 1559994750, 1559994750, 
    1560075187.5, 1560075187.5, 1560612150, 1560612150, 1560686600, 
    1560686600, 1560744720, 1560744720, 1560897000, 1560897000, 
    1560945494.11765, 1560945494.11765, 1561031258.82353, 1561031258.82353, 
    1561124353.84615, 1561124353.84615, 1561174650, 1561174650, 
    1561397760, 1561397760, 1561469250, 1561469250, 1561509600, 
    1561509600), class = c("POSIXct", "POSIXt"), tzone = "UTC"), 
    conc_mean = c(2.21485, 4.51665, 1.07492666666667, 3.61554666666667, 
    1.2719875, 3.3012125, 0.765063157894737, 3.71997368421053, 
    0.375805263157895, 1.10004210526316, 0.675033333333333, 1.17912, 
    1.23057222222222, 3.79774444444444, 0.204633333333333, 0.578241666666667, 
    0.54028, 0.23396, 0.702907142857143, 0.971378571428571, 0.813372727272727, 
    1.31120909090909, 0.87175, 1.3416, 1.15376, 3.93216, 0.3061, 
    1.58768333333333, 0.325572727272727, 0.530245454545455, 0.735842857142857, 
    1.18681428571429, 0.489575, 0.8701375, 0.431847058823529, 
    0.618288235294118, 0.572268421052632, 1.00910526315789, 0.475021428571429, 
    1.11840714285714, 0.437810526315789, 0.73228947368421, 0.677941666666667, 
    1.26760833333333, 0.4298875, 0.667275, 0, 0.141375, 0.396471428571429, 
    0.566985714285714, 0.562175, 0.5603625, 0.415214285714286, 
    1.04814285714286, 0.139766666666667, 0.1184, 0.158435714285714, 
    0.493435714285714, 0.738375, 2.1870375, 0.5032125, 1.860325, 
    0, 0, 0.80184, 1.6749, 0.629425, 1.32535, 0.621492857142857, 
    2.09426428571429, 0.521784615384615, 0.8041, 0.0966, 0.02106, 
    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.11013, 
    0.45911, 0.945981818181818, 2.15627272727273, 0.44487, 0.68837, 
    0.8569, 1.47154444444444, 0.40066, 0.88519, 0, 0, 0.278175, 
    0.1233125, 0.199175, 0.108025, 0.1002, 0.1679, 0.157933333333333, 
    0.303033333333333, 0.231433333333333, 0.330433333333333, 
    0.5878, 0.694266666666667, 1.13938333333333, 0.78425, 3.01142142857143, 
    0.8532, 2.96855333333333, 0.905413333333333, 2.63885384615385, 
    0.831161538461539, 0.0564933333333333, 0.110113333333333, 
    0.0251636363636364, 0.0381454545454545, 0.032775, 0.070375, 
    0.171754545454545, 0.179809090909091, 0.868431578947368, 
    0.290926315789474, 1.460875, 0.3505375, 0.515116666666667, 
    0.2017, 0.170970588235294, 0.0566647058823529, 2.00161764705882, 
    0.891194117647059, 2.27995882352941, 1.07888823529412, 0.4599, 
    0.172966666666667, 0.292129411764706, 0.3191, 1.30511111111111, 
    0.858427777777778, 0.90774, 0.82456, 0.538777777777778, 0.221883333333333, 
    0.509583333333333, 0.280516666666667, 0.24795, 0.14805, 0.08165, 
    0.09388125, 0.0355947368421053, 0.0266210526315789, 0.0540666666666667, 
    0.0445833333333333, 0.0329111111111111, 0.0137111111111111, 
    0.431323529411765, 0.138288235294118, 0.946082352941176, 
    0.597052941176471, 0.0175, 0.00785294117647059, 0.03314375, 
    0.019, 0.04485, 0.0101714285714286, 1.12921176470588, 0.166876470588235, 
    2.01030625, 1.2114875, 1.25706875, 0.54935, 0.0532833333333333, 
    0.05245, 0.0311222222222222, 0.00601666666666667, 0, 0, 0, 
    0, 0, 0, 0, 0, 0.168461538461538, 0.0720230769230769, 0, 
    0, 0.46362, 0.17162, 0.347108333333333, 0.1352, 0.255366666666667, 
    0.0637), conc_min = c(0.9481, 1.016, 0, 0, 0, 0, 0.1382, 
    0.4736, 0.1568, 0.2592, 0.1855, 0.1443, 0.3351, 0.4526, 0.0364, 
    0.0148, 0.3338, 0.0614, 0.1845, 0.193, 0.298, 0.2129, 0, 
    0, 0.182, 0.3781, 0.1973, 0.5151, 0, 0, 0.289, 0.0466, 0.076, 
    0.0312, 0.1458, 0.0806, 0.1124, 0.0219, 0.1038, 0.0628, 0, 
    0, 0, 0, 0, 0, 0, 0.0911, 0, 0, 0.3236, 0.0391, 0.0757, 0.0159, 
    0.0289, 0, 0.0117, 0.0052, 0.1448, 0.1749, 0, 0, 0, 0, 0, 
    0, 0.2611, 0.1329, 0.1001, 0.6807, 0.0311, 0.0042, 0.0149, 
    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
    0.2353, 0.5611, 0.0524, 0.0392, 0.2764, 0.3357, 0, 0, 0, 
    0, 0, 0, 0, 0, 0.1002, 0.1679, 0.0878, 0.2908, 0, 0, 0.4953, 
    0.4679, 0.7842, 0.2518, 1.1497, 0.4129, 1.4005, 0.3776, 1.0426, 
    0.3828, 0.0077, 0.0047, 0, 0, 0.026, 0.0039, 0.0241, 0.0029, 
    0.0522, 0.0555, 0.1238, 0.0305, 0.025, 0.0009, 0.0211, 0.0007, 
    0.035, 0.0093, 0.2304, 0.1012, 0.0358, 0.0139, 0.0711, 0.0259, 
    0.2234, 0.1971, 0.012, 0, 0.0079, 0, 0, 0, 0.0348, 0.0258, 
    0.006, 0, 0.0055, 0, 0.0081, 0, 0.0109, 0, 0.0144, 0, 0.0276, 
    0.0015, 0.0047, 0, 0.007, 0, 0.0114, 0, 0.0062, 0, 0.2045, 
    0.3129, 0, 0, 0.0173, 0.0009, 0.0123, 0, 0, 0, 0, 0, 0, 0, 
    0, 0, 0, 0, 0, 0, 0.1093, 0.0679, 0.0855, 0.0256, 0.0927, 
    0.0266), conc_max = c(5.2082, 9.4515, 2.6412, 9.5067, 3.374, 
    10.2935, 1.9887, 7.3334, 1.1261, 2.2172, 2.521, 2.9801, 3.3107, 
    7.9089, 0.701, 1.181, 0.9013, 0.7176, 1.3709, 2.6799, 2.4004, 
    2.6443, 1.7978, 3.5656, 2.2826, 9.0001, 0.4704, 3.1122, 1.1959, 
    1.0669, 1.8055, 2.8748, 1.4114, 2.7354, 0.9683, 1.6872, 1.7906, 
    3.068, 1.1533, 3.1572, 1.61, 1.8917, 3.1135, 3.3496, 0.8959, 
    1.6323, 0, 0.1973, 1.1029, 1.7997, 1.0299, 1.3705, 1.7949, 
    5.4322, 0.4341, 0.3075, 0.6009, 1.5614, 1.6237, 6.4092, 1.6444, 
    4.1521, 0, 0, 1.6438, 4.4297, 1.512, 2.4371, 2.0231, 6.2908, 
    1.5731, 2.59, 0.3182, 0.0694, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
    0, 0, 0, 0, 0, 0, 0, 0.5511, 2.499, 1.596, 3.4777, 1.018, 
    1.5773, 1.8561, 2.5637, 1.0436, 3.3362, 0, 0, 1.3413, 0.3713, 
    0.6086, 0.4185, 0.1002, 0.1679, 0.2585, 0.3129, 0.6006, 0.7198, 
    0.668, 1.1023, 1.8961, 1.2774, 6.3908, 1.2608, 5.7836, 1.9329, 
    4.8889, 2.1084, 0.4252, 0.5633, 0.0532, 0.1212, 0.0488, 0.262, 
    0.3876, 1.006, 3.4004, 1.1248, 4.1029, 1.2065, 2.13, 0.6134, 
    0.7077, 0.2737, 6.3182, 2.0403, 6.3883, 2.3115, 1.4964, 0.5299, 
    1.2378, 0.9909, 4.1648, 2.5412, 4.6703, 2.0224, 2.8942, 0.9106, 
    1.1358, 0.7632, 0.4456, 0.2783, 0.4417, 0.5307, 0.0934, 0.1239, 
    0.2766, 0.2853, 0.1005, 0.1172, 4.3601, 1.3379, 4.5632, 2.9013, 
    0.0615, 0.0915, 0.0648, 0.1201, 0.1214, 0.0886, 7.008, 1.001, 
    4.6935, 5.0903, 7.8913, 1.6407, 0.1217, 0.2257, 0.1106, 0.0603, 
    0, 0, 0, 0, 0, 0, 0, 0, 1.0643, 0.5233, 0, 0, 0.6608, 0.2956, 
    0.8226, 0.3397, 0.3421, 0.1225)), class = c("tbl_df", "tbl", 
"data.frame"), row.names = c(NA, -202L))
Jhonathan asked May 15 '20 02:05


1 Answer

This turned out to be slightly more complicated than I first thought, but I have a solution that seems to work. At first glance, it seems you could just set data=stat_total[which(stat_total$conc_mean!=0),], so that only the non-zero values are plotted, but that doesn't work. The reason is that ggplot still connects the line straight across the gap via geom_path and draws the ribbon via geom_ribbon, since data exists to the left and right of the removed zero values.

The key is to assign the group= aesthetic, which controls the connectivity of geoms such as lines. This is easily demonstrated with the following:

library(ggplot2)

# grp splits the ten points into three separately drawn line segments
d <- data.frame(x=1:10, y=1:10, grp=c(rep(1,4),2,rep(3,5)))
ggplot(d, aes(x,y)) + theme_bw() + 
    geom_line(aes(group=grp)) + geom_point()


So the solution to your example involves applying a group= aesthetic to the "sections" of stat_total$conc_mean that don't equal zero, while also not plotting the rows where stat_total$conc_mean equals zero. Critically, each "section" needs a different group= value. If they share one, the whole series stays connected as it is now, because data still exists to the left and right of those zeros and ggplot will draw a line through them.
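To make the idea of "sections" concrete, here is a minimal base R sketch on a made-up vector (the name conc and its values are illustrative and not taken from stat_total). It uses rle() to label each run of consecutive zero or non-zero values, so every non-zero stretch ends up with its own id:

```r
# Toy concentrations; zeros mark the gaps (hypothetical values).
conc <- c(0.5, 0.7, 0, 0, 1.2, 0.9, 0, 0.3)

# rle() finds runs of consecutive TRUE/FALSE values in `conc != 0`.
runs <- rle(conc != 0)

# Give every element the index of the run it belongs to.
run_id <- rep(seq_along(runs$lengths), runs$lengths)

# Keep only the non-zero values; each remaining run is one "section".
sections <- split(conc[conc != 0], run_id[conc != 0])
print(sections)
```

Each element of sections is one stretch that should be drawn as its own line, which is what a distinct group= value per section achieves in the plot.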

Solution

First, I arranged your data frame by stat_total$gas and then stat_total$date_mean.

library(dplyr)  # provides arrange()

df <- arrange(stat_total, gas, date_mean)

Then, I wanted to

(1) Create a column that indicates whether stat_total$conc_mean is 0 or contains a value > 0. I concede there is probably a more elegant way to accomplish the goal without this step, but it also makes the solution easier to follow.

df$a <- ifelse(df$conc_mean==0, NA, 1)

(2) Use a function to create a new grouping column. The function steps through a vector and stores a count number (g_num) into a return vector in that position when there is a number, but stores NA and increments g_num when it finds an NA. The result is a return vector that has the sequence of numbers we want here.

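my_func <- function(x) {
  g_num <- 1  # current group number
  return_vect <- vector(mode='double',length=length(x))
  # seq_along() is safe for zero-length input, unlike 1:length(x)
  for(i in seq_along(x)) {
    if (is.na(x[i])){
      # a gap: store NA and start a new group
      return_vect[i] <- NA
      g_num <- g_num+1
    }
    else {
      return_vect[i] <- g_num
    }
  }
  return(return_vect)
}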

# create the new column
df$g <- my_func(df$a)

An example of how it works is shown below:

> test <- c(1,1,1,NA,NA,1,1,NA,1,1)
> test
 [1]  1  1  1 NA NA  1  1 NA  1  1
> my_func(test)
 [1]  1  1  1 NA NA  3  3 NA  4  4
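As a possibly more compact alternative, the same numbering can be sketched without a loop. This is only an illustration, and group_runs is a hypothetical name that is not part of the solution above. It relies on the observation that the group number at each position is 1 plus the number of NAs seen so far:

```r
# Hypothetical vectorized counterpart of my_func().
group_runs <- function(x) {
  g <- 1 + cumsum(is.na(x))  # bump the group number at every NA
  g[is.na(x)] <- NA          # the NA positions themselves stay NA
  g
}

test <- c(1, 1, 1, NA, NA, 1, 1, NA, 1, 1)
print(group_runs(test))
```

On the test vector above this should give the same result as my_func(test), so df$g <- group_runs(df$a) could be used in place of the loop version.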

(3) Plot it. It's the same as your original code, but we use the new column as the group= aesthetic, and also plot only the non-zero values of conc_mean (so you avoid getting a line along the bottom of the graph in the gap sections).

ggplot(df[which(df$conc_mean!=0),], aes(color=gas, group=g)) + 
  geom_path(aes(x=date_mean, y=conc_mean, color=gas), size=1.2, na.rm = T) +          
  geom_ribbon(aes(x=date_mean, ymin=conc_min, ymax=conc_max, fill=gas), color="grey70", alpha=0.4, na.rm = T)+
  scale_x_datetime(date_breaks = "3 weeks" , date_labels = "%d-%b") + 
  xlab(NULL) +
  ylab('[ppb]') + 
  theme_bw() +
  facet_wrap(gas~.,scales = 'free_x',ncol = 1,nrow=2)

chemdork123 answered Sep 21 '22 05:09