In R, split a dataframe so subset dataframes contain last row of previous dataframe and first row of subsequent dataframe

Question

There are many answers on how to split a dataframe, for example "How to split a data frame?"

However, I'd like to split a dataframe so that the smaller dataframes contain the last row of the previous dataframe and the first row of the following dataframe.

Here's an example:

n <- 1:9
group <- rep(c("a","b","c"), each = 3)
DF <- data.frame(n = n, group = group)
DF

  n group
1 1     a
2 2     a
3 3     a
4 4     b
5 5     b
6 6     b
7 7     c
8 8     c
9 9     c

I'd like the output to look like:

d1 <- data.frame(n = 1:4, group = c(rep("a",3),"b"))
d2 <- data.frame(n = 3:7, group = c("a",rep("b",3),"c"))
d3 <- data.frame(n = 6:9, group = c("b",rep("c",3)))
d <- list(d1, d2, d3)
d

[[1]]
  n group
1 1     a
2 2     a
3 3     a
4 4     b

[[2]]
  n group
1 3     a
2 4     b
3 5     b
4 6     b
5 7     c

[[3]]
  n group
1 6     b
2 7     c
3 8     c
4 9     c

What is an efficient way to accomplish this task?

G. Grothendieck · Accepted Answer

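Suppose DF is the original data.frame, the one with columns n and group, and let n be the number of rows in DF. Define a function extract that, given a vector of row indexes ix, widens it to include the index just before the first and the index just after the last, and then returns those rows of DF. The max(1, ...) and min(n, ...) calls clamp the range so the first and last groups do not run past the ends of DF. With extract defined, split the vector 1, ..., n by group and apply extract to each component of the split.

Note that extract takes every row between the smallest and largest index of a group, so this assumes the rows of each group are contiguous in DF, as they are in the example.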

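n <- nrow(DF)
# rows from one before the group's first index to one after its last, clamped to 1..n
extract <- function(ix) DF[seq(max(1, min(ix) - 1), min(n, max(ix) + 1)), ]
# split the row indexes by group, then pull the widened row range for each group
lapply(split(seq_len(n), DF$group), extract)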

$a
  n group
1 1     a
2 2     a
3 3     a
4 4     b

$b
  n group
3 3     a
4 4     b
5 5     b
6 6     b
7 7     c

$c
  n group
6 6     b
7 7     c
8 8     c
9 9     c
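
The result above is a list named by group, and each piece keeps the row names of DF, whereas the output requested in the question is an unnamed list with rows renumbered from 1. If that exact shape matters, the same approach could be wrapped up as in the sketch below. The function name split_with_neighbors and its arguments are hypothetical, introduced here only for illustration, and the sketch still assumes each group's rows are contiguous.

```r
# Hypothetical wrapper around the answer's approach (name and arguments are
# illustrative, not from the original answer).
split_with_neighbors <- function(df, group_col) {
  n <- nrow(df)
  extract <- function(ix) {
    # widen the index range by one row on each side, clamped to 1..n
    rows <- seq(max(1, min(ix) - 1), min(n, max(ix) + 1))
    out <- df[rows, , drop = FALSE]
    rownames(out) <- NULL  # renumber rows from 1 within each piece
    out
  }
  # drop the group names so the result is a plain positional list
  unname(lapply(split(seq_len(n), df[[group_col]]), extract))
}

DF <- data.frame(n = 1:9, group = rep(c("a", "b", "c"), each = 3))
d <- split_with_neighbors(DF, "group")
d
```

One thing to keep in mind: split orders the pieces by the sorted group values (or factor levels), not by order of appearance in DF, so the list order matches the row order here only because the groups a, b, c are already sorted.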

Tags: dataframe, r, subset

Tedward
