Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

In R, split a dataframe so subset dataframes contain last row of previous dataframe and first row of subsequent dataframe

There are many answers for how to split a dataframe, for example How to split a data frame?

However, I'd like to split a dataframe so that the smaller dataframes contain the last row of the previous dataframe and the first row of the following dataframe.

Here's an example

n <- 1:9
group <- rep(c("a","b","c"), each = 3)
data.frame(n = n, group)

  n  group
1 1     a
2 2     a
3 3     a
4 4     b
5 5     b
6 6     b
7 7     c
8 8     c
9 9     c

I'd like the output to look like:

 d1 <- data.frame(n = 1:4, group = c(rep("a",3),"b"))
 d2 <- data.frame(n = 3:7, group = c("a",rep("b",3),"c"))
 d3 <- data.frame(n = 6:9, group = c("b",rep("c",3)))
 d <- list(d1, d2, d3)
 d

[[1]]
  n group
1 1     a
2 2     a
3 3     a
4 4     b

[[2]]
  n group
1 3     a
2 4     b
3 5     b
4 6     b
5 7     c

[[3]]
  n group
1 6     b
2 7     c
3 8     c
4 9     c

What is an efficient way to accomplish this task?

like image 942
Tedward Avatar asked Dec 11 '22 19:12

Tedward


1 Answers

Suppose DF is the original data.frame, the one with columns n and group. Let n be the number of rows in DF. Now define a function extract which given a sequence of indexes ix enlarges it to include the one prior to the first and after the last and then returns those rows of DF. Now that we have defined extract, split the vector 1, ..., n by group and apply extract to each component of the split.

n <- nrow(DF)
extract <- function(ix) DF[seq(max(1, min(ix) - 1), min(n, max(ix) + 1)), ]
lapply(split(seq_len(n), DF$group), extract)

$a
  n group
1 1     a
2 2     a
3 3     a
4 4     b

$b
  n group
3 3     a
4 4     b
5 5     b
6 6     b
7 7     c

$c
  n group
6 6     b
7 7     c
8 8     c
9 9     c
like image 60
G. Grothendieck Avatar answered May 24 '23 06:05

G. Grothendieck