I have a long sequence of 1s and 0s which represent bird incubation patterns, 1 being bird ON the nest, 0 being OFF.
    > Fake.data<- c(1,1,1,1,1,0,0,1,1,1,1,0,0,0,1,1,1,1,0,1,1,1,1,0,0,1,1,1,1,1,0,0,0,0,1,1,0,1,0)
As an end point, I would like a single value for each ON period: the ratio of the length of the following OFF period to the length of that ON period. So for Fake.data the result should ideally be a vector like this
    [1] 0.4  0.75  0.25  0.5  0.8  0.5  1 #(I just typed this out!) 
So far I have split the vector into sections using split()
    > Diff<-diff(Fake.data)
    > SPLIT<-split(Fake.data, cumsum(c(1, Diff > 0 )))
    > SPLIT
Which returns...
    $`1`
    [1] 1 1 1 1 1 0 0
    $`2`
    [1] 1 1 1 1 0 0 0
    $`3`
    [1] 1 1 1 1 0
    $`4`
    [1] 1 1 1 1 0 0
    $`5`
    [1] 1 1 1 1 1 0 0 0 0
    $`6`
    [1] 1 1 0
    $`7`
    [1] 1 0
So I can get the ratio for a single split group using
    > SPLIT$'1'<- ((length(SPLIT$'1'))-(sum(SPLIT$'1')))/sum(SPLIT$'1')
    > SPLIT$'1'
    [1] 0.4
However, my real data contain several thousand of these groups, and I would like to use something like tapply() or a for() loop to calculate the ratio automatically for all of them and collect the results in a single vector. I have tried both methods with little success, as the split() output structure does not seem to fit these functions.
I create a new vector to receive the for() loop output
    ratio<-rep(as.character(NA),(length(SPLIT)))
Then I attempt the for() loop using the code above, which works for a single run.
    for(i in SPLIT$'1':'7')
    {ratio[i]<-((length(SPLIT$'[i]'))-(sum(SPLIT$'[i]')))/sum(SPLIT$'[i]')}
What I get is...
    [1] "NaN" "NaN" "NaN" "NaN" "NaN" "NaN" NA
Tried many other variations along this theme but now just really stuck!
I think you were very close with your strategy. The sapply function is very happy to work with lists, so it can be applied directly to the split() output. I would just change the last step to
    sapply(SPLIT, function(x) sum(x==0)/sum(x==1))
which returns
       1    2    3    4    5    6    7 
    0.40 0.75 0.25 0.50 0.80 0.50 1.00 
with your sample data. No additional packages needed.
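If a for() loop is preferred, the attempt in the question can be repaired with two changes. `SPLIT$'[i]'` looks up an element literally named "[i]", which does not exist, so it returns NULL and the ratio becomes 0/0 = NaN; elements are selected by position with `SPLIT[[i]]`. The loop range should also be a sequence of indices, such as `seq_along(SPLIT)`. The following is a minimal sketch under those assumptions, rebuilding SPLIT from the sample data so that it runs on its own:

```r
# Sample data from the question
Fake.data <- c(1,1,1,1,1,0,0,1,1,1,1,0,0,0,1,1,1,1,0,1,1,1,1,0,0,
               1,1,1,1,1,0,0,0,0,1,1,0,1,0)

# Start a new group at every 0 -> 1 transition
SPLIT <- split(Fake.data, cumsum(c(1, diff(Fake.data) > 0)))

# Numeric (not character) vector to receive the results
ratio <- numeric(length(SPLIT))

for (i in seq_along(SPLIT)) {
  x <- SPLIT[[i]]                            # i-th group, selected by position
  ratio[i] <- (length(x) - sum(x)) / sum(x)  # number of 0s / number of 1s
}

ratio
```

For several thousand groups, sapply() as shown above is shorter and does the same work.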
Here are two possibilities:
1) Compute the run lengths using rle. If the data starts with a 0, the if statement drops the first length, so that the lengths are guaranteed to start with a run of 1s. Finally, compute the ratios using rollapply from the zoo package:
    library(zoo)
    lengths <- rle(Fake.data)$lengths
    if (Fake.data[1] == 0) lengths <- lengths[-1]
    rollapply(lengths, 2, by = 2, function(x) x[2]/x[1])
giving:
    [1] 0.40 0.75 0.25 0.50 0.80 0.50 1.00
The if line can be removed if we know that the data always starts with a 1.
2) If we can assume that the series always starts with a 1 and ends in a 0, then this one-liner would work:
    with( rle(Fake.data), lengths[values == 0] / lengths[values == 1] )
giving the same answer as above.
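When real data cannot be assumed to start with a 1 and end in a 0, the one-liner in 2) would pair up runs incorrectly or recycle with a warning. A possible base R sketch that trims an unmatched leading OFF run and an unmatched trailing ON run first is shown below; the function name `on_off_ratio` is hypothetical and not part of any package:

```r
# Hypothetical helper: OFF/ON length ratio for each complete ON -> OFF pair
on_off_ratio <- function(x) {
  if (length(x) == 0) return(numeric(0))

  r   <- rle(x)
  len <- r$lengths
  val <- r$values

  # Drop a leading OFF run, which has no preceding ON run
  if (val[1] == 0) {
    len <- len[-1]
    val <- val[-1]
  }

  # Drop a trailing ON run, which has no following OFF run
  n <- length(val)
  if (n > 0 && val[n] == 1) {
    len <- len[-n]
    val <- val[-n]
  }

  # Runs now alternate 1, 0, 1, 0, ... so the two vectors have equal length
  len[val == 0] / len[val == 1]
}

Fake.data <- c(1,1,1,1,1,0,0,1,1,1,1,0,0,0,1,1,1,1,0,1,1,1,1,0,0,
               1,1,1,1,1,0,0,0,0,1,1,0,1,0)

on_off_ratio(Fake.data)
on_off_ratio(c(0, 0, Fake.data, 1, 1))  # extra leading 0s and trailing 1s are ignored
```

Both calls should give the same seven ratios as above, since the added runs at either end do not form a complete ON/OFF pair.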