
How to split a data frame into multiple data frames using a conditional statement in R

I have data that looks like this:

time <- c(1:20)
temp <- c(2,3,4,5,6,2,3,4,5,6,2,3,4,5,6,2,3,4,5,6)
data <- data.frame(time,temp)

This is a very basic representation of my data. If you plot it, you can easily see that there are 4 up-sloping groups of data. I want to split the original data frame into these 4 subsets so that I can run calculations on each of them, such as mean, max, min and standard deviation. I'd like to use split(), but it only splits based on factor levels. I'd like to be able to feed split() a conditional statement instead, such as split if: diff(data$temp) > -2.

My actual problem is much more complex than this, but is there a function like split() that will let me create new data frames based on a conditional statement, as opposed to splitting based on factor levels?

Thanks all!

user1667477 asked Oct 01 '22 23:10


1 Answer

The trick is to convert your conditional statement into something that can be construed as a factor. In this particular example:

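# Change in temp from one row to the next; the leading 1 keeps the length equal to nrow(data)
tmp <- c(1,diff(data[[2]]))
#  [1]  1  1  1  1  1 -4  1  1  1  1 -4  1  1  1  1 -4  1  1  1  1
# TRUE wherever temp drops, i.e. at the first row of each new group
tmp2 <- tmp < 0
#  [1] FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE  TRUE FALSE
# [13] FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE
# The running count of drops gives every row a group id
tmp3 <- cumsum(tmp2)
#  [1] 0 0 0 0 0 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3
# split() treats the group ids as a factor
split(data, tmp3)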
# $`0`
#   time temp
# 1    1    2
# 2    2    3
# 3    3    4
# 4    4    5
# 5    5    6
# 
# $`1`
#    time temp
# 6     6    2
# 7     7    3
# 8     8    4
# 9     9    5
# 10   10    6
# 
# $`2`
#    time temp
# 11   11    2
# 12   12    3
# 13   13    4
# 14   14    5
# 15   15    6
# 
# $`3`
#    time temp
# 16   16    2
# 17   17    3
# 18   18    4
# 19   19    5
# 20   20    6
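
Since the goal in the question was to run calculations such as mean, max, min and standard deviation on each subset, here is a minimal sketch of one way to do that on the list returned by split(). The names group, pieces and stats are only illustrative and do not come from the code above, and the "new group wherever temp drops" rule is an assumption that fits this toy data; a more complex data set would probably need a different condition inside cumsum().

```r
# Rebuild the example data from the question.
time <- c(1:20)
temp <- c(2,3,4,5,6,2,3,4,5,6,2,3,4,5,6,2,3,4,5,6)
data <- data.frame(time, temp)

# Assumption: a new group starts wherever temp is lower than in the previous row.
# The leading FALSE stands in for the first row, which has no previous value.
group <- cumsum(c(FALSE, diff(data$temp) < 0))

# One data frame per group, named "0", "1", "2", "3".
pieces <- split(data, group)

# Build one row of summary statistics per piece and stack the rows.
stats <- do.call(rbind, lapply(pieces, function(d) {
  data.frame(mean = mean(d$temp),
             max  = max(d$temp),
             min  = min(d$temp),
             sd   = sd(d$temp))
}))
stats
```

For this toy data every piece holds the values 2 to 6, so all four rows of stats should be identical. If only a single statistic is needed, sapply(pieces, function(d) mean(d$temp)) returns a named vector instead of a data frame.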
Blue Magister answered Oct 05 '22 10:10