In R: Add rows with annual date intervals to data frame

Tags: date, dataframe, r

I would like to split each sample point's from-to period into annual intervals, adding one row per interval. In the added rows, only the "from" and "to" columns should change; all other information should be carried over from the original row.

What I have right now:

 > sample_points
point       from         to     label
    1 2004-05-01 2007-05-01  cropland
    2 2009-05-01 2012-05-01 grassland
    3 2014-05-01 2016-05-01    forest

What I need:

 > sample_points
point       from         to     label
    1 2004-05-01 2005-05-01  cropland
    1 2005-05-01 2006-05-01  cropland
    1 2006-05-01 2007-05-01  cropland
    2 2009-05-01 2010-05-01 grassland
    2 2010-05-01 2011-05-01 grassland
    2 2011-05-01 2012-05-01 grassland
    3 2014-05-01 2015-05-01    forest
    3 2015-05-01 2016-05-01    forest  

Here is the example data frame:

point <- c("1", "2", "3")
from <- as.Date(c("2004-05-01", "2009-05-01", "2014-05-01"))
to <- as.Date(c("2007-05-01", "2012-05-01", "2016-05-01"))
label <- c("cropland", "grassland", "forest")

sample_points <- data.frame(point, from, to, label)

I am new to R and this is my first question here, so please forgive me if the question is not formulated perfectly, something is missing or I've missed a similar question with a solution for my problem. I'm thankful for any hints!

Siska asked Nov 28 '25 11:11


2 Answers

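Here is one tidyverse option:

We create a yearly sequence from the "from" value to the "to" value of each row and unnest it into the "from" column. The "to" column is then set to the next "from" value within each point, and the last row of each point, which has no next value, is dropped.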

library(tidyverse)

sample_points %>%
  mutate(from = map2(from, to, seq, by = 'year')) %>%
  unnest(from) %>%
  group_by(point) %>%
  mutate(to = lead(from)) %>%
  filter(!is.na(to))

#  point from       to         label    
#  <chr> <date>     <date>     <chr>    
#1 1     2004-05-01 2005-05-01 cropland 
#2 1     2005-05-01 2006-05-01 cropland 
#3 1     2006-05-01 2007-05-01 cropland 
#4 2     2009-05-01 2010-05-01 grassland
#5 2     2010-05-01 2011-05-01 grassland
#6 2     2011-05-01 2012-05-01 grassland
#7 3     2014-05-01 2015-05-01 forest   
#8 3     2015-05-01 2016-05-01 forest   
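
The core of this approach, a yearly sequence whose consecutive values are paired up, can be checked on a single row in base R. This is only a sketch: the helper name annual_intervals is made up for illustration and is not part of the answer above.

```r
# Hypothetical helper (not from the answer): split one from-to period
# into consecutive annual intervals.
annual_intervals <- function(from, to) {
  breaks <- seq(from, to, by = "year")   # yearly break points, starting at `from`
  data.frame(from = head(breaks, -1),    # every break except the last
             to   = tail(breaks, -1))    # every break except the first
}

whole <- annual_intervals(as.Date("2004-05-01"), as.Date("2007-05-01"))
whole    # 3 rows: 2004-2005, 2005-2006, 2006-2007

# If `to` is not an anniversary of `from`, seq() stops at the last
# anniversary before `to`, so the trailing partial year is not returned.
partial <- annual_intervals(as.Date("2004-05-01"), as.Date("2007-03-01"))
partial  # 2 rows: 2004-2005, 2005-2006
```

The same truncation applies to the pipeline above, since lead() can only pair the values that seq() returned.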
Ronak Shah answered Nov 30 '25 00:11


You could build a yearly sequence for each row, repeat every value twice, and fill a two-column matrix by row so that each matrix row holds one from-to pair. The surplus last row is deleted afterwards.

res <- do.call(rbind, lapply(1:nrow(sample_points), function(m) {
  cc <- c("from", "to")
  dc <- as.character(do.call(seq, as.list(c(sample_points[m, cc], by="year"))))
  if (length(dc) == 2) {
    o <- sample_points[m, ]
  } else {
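    # Odd-length input: the recycling warning is suppressed; the padded last row is dropped below
    dm <- suppressWarnings(matrix(rep(dc, each = 2)[-1], ncol = 2, byrow = TRUE))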
    dm <- if (nrow(dm) == 1) dm else dm[-nrow(dm), ]
    o <- setNames(data.frame(sample_points[m, "point"], dm, 
                             sample_points[m, "label"]),names(sample_points))
    o[cc] <- lapply(o[cc], as.Date)
  }
  o
}))

This gives:

res
#   point       from         to     label
# 1     1 2004-05-01 2005-05-01  cropland
# 2     1 2005-05-01 2006-05-01  cropland
# 3     1 2006-05-01 2007-05-01  cropland
# 4     2 2009-05-01 2010-05-01 grassland
# 5     2 2010-05-01 2011-05-01 grassland
# 6     2 2011-05-01 2012-05-01 grassland
# 7     3 2014-05-01 2015-05-01    forest
# 8     3 2015-05-01 2016-05-01    forest

The column types are preserved:

str(res)
# 'data.frame': 8 obs. of  4 variables:
# $ point: chr  "1" "1" "1" "2" ...
# $ from : Date, format: "2004-05-01" ...
# $ to   : Date, format: "2005-05-01" ...
# $ label: chr  "cropland" "cropland" "cropland" "grassland" ...

Data:

sample_points <- structure(list(point = c("1", "2", "3"), from = structure(c(12539, 
14365, 16191), class = "Date"), to = structure(c(13634, 15461, 
16922), class = "Date"), label = c("cropland", "grassland", "forest"
)), class = "data.frame", row.names = c(NA, -3L))
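
The repeat-twice step in this answer can be looked at in isolation. The sketch below is a slight variant, not the answer's exact code: it drops both the first and the last repeated value, so the vector has even length and fills the matrix without a recycling warning. The vector dc is example data and must hold at least two dates.

```r
# Example yearly break points as character strings (illustrative data)
dc <- c("2004-05-01", "2005-05-01", "2006-05-01", "2007-05-01")

# Repeat every value twice: d1 d1 d2 d2 d3 d3 d4 d4
doubled <- rep(dc, each = 2)

# Drop the first and last element: d1 d2 d2 d3 d3 d4
inner <- doubled[-c(1, length(doubled))]

# Fill by row so each row is one (from, to) pair
pairs <- matrix(inner, ncol = 2, byrow = TRUE,
                dimnames = list(NULL, c("from", "to")))
pairs
```

Each row's "to" equals the next row's "from", which is the layout the question asks for.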
jay.sf answered Nov 30 '25 01:11