I have a list of several dataframes, each of which contains a sequence of dates and, for each date, either +1 to indicate an increase or -1 to indicate a decrease.
Here’s an example:
security1 <- data.frame(
  date = seq(from = as.Date('2019-01-01'), to = as.Date('2019-01-10'), by = 'day'),
  direction = c(1, 1, 1, -1, -1, 1, 1, 1, 1, -1))
security2 <- data.frame(
  date = seq(from = as.Date('2019-01-01'), to = as.Date('2019-01-10'), by = 'day'),
  direction = c(1, -1, 1, -1, -1, 1, 1, -1, 1, -1))
clcn <- list(Sec1 = security1, Sec2 = security2)
For each dataframe, I am trying to find the length of the most recent streak (the run of consecutive moves in the same direction that ends on the last date) and the last time a streak was longer than this. The current streak may be just 1 day if the previous day’s move was in the other direction.
I’ve searched for several days for an answer to this and found the following, which uses sequence and rle on a single dataframe, in the question "Compute counting variable in dataframe":
sequence(rle(as.character(data$list))$lengths)
But I’m struggling to feed that into lapply or map to get it to iterate over the list.
I don’t mind the exact output, but ideally it would include: Dataframe name, current streak, previous streak that’s longer, and date that streak ended. But at the most basic, just getting the sequence number added as a new column on the dataframe would be a huge help, and I can (try to) take it from there.
@akrun has the right idea, but since you asked for the streak to be added to the data.frame, perhaps:
library(tidyverse)
clcn %>%
  map(~ mutate(., streak = sequence(rle(direction)$lengths)))
$Sec1
         date direction streak
1  2019-01-01         1      1
2  2019-01-02         1      2
3  2019-01-03         1      3
4  2019-01-04        -1      1
5  2019-01-05        -1      2
6  2019-01-06         1      1
7  2019-01-07         1      2
8  2019-01-08         1      3
9  2019-01-09         1      4
10 2019-01-10        -1      1

$Sec2
         date direction streak
1  2019-01-01         1      1
2  2019-01-02        -1      1
3  2019-01-03         1      1
4  2019-01-04        -1      1
5  2019-01-05        -1      2
6  2019-01-06         1      1
7  2019-01-07         1      2
8  2019-01-08        -1      1
9  2019-01-09         1      1
10 2019-01-10        -1      1
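Since the question mentions lapply, here is a rough base R sketch of the same step without the tidyverse. It rebuilds the example list so it runs on its own; the name clcn_streaks is just something I made up for illustration, and it assumes each data frame is already sorted by date.

```r
# Rebuild the example list from the question so the snippet is self-contained
dates <- seq(from = as.Date("2019-01-01"), to = as.Date("2019-01-10"), by = "day")
clcn <- list(
  Sec1 = data.frame(date = dates, direction = c(1, 1, 1, -1, -1, 1, 1, 1, 1, -1)),
  Sec2 = data.frame(date = dates, direction = c(1, -1, 1, -1, -1, 1, 1, -1, 1, -1))
)

# lapply() keeps the list names; each data frame gains a running streak counter.
# clcn_streaks is a hypothetical name, not something from the original question.
clcn_streaks <- lapply(clcn, function(df) {
  # rle() finds runs of equal values; sequence() counts 1..n within each run
  df$streak <- sequence(rle(df$direction)$lengths)
  df
})

clcn_streaks$Sec1$streak
# 1 2 3 1 2 1 2 3 4 1
```

The result should match the streak column from the map() version above, so which one to use is mostly a matter of taste.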
From there, you could do more mutate calls / additions, such as:
clcn %>%
  map(
    ~ mutate(
      .,
      streak = sequence(rle(direction)$lengths),
      max_streak = streak == max(streak)
    )
  )
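For the summary the question asks for (dataframe name, current streak, the previous streak that's longer, and the date it ended), something along these lines might work. This is only a sketch: streak_summary and the output column names are hypothetical names of mine, it assumes each data frame has at least one row and is sorted by date, and it treats "longer" as strictly longer than the current streak.

```r
# Rebuild the example list from the question so the snippet is self-contained
dates <- seq(from = as.Date("2019-01-01"), to = as.Date("2019-01-10"), by = "day")
clcn <- list(
  Sec1 = data.frame(date = dates, direction = c(1, 1, 1, -1, -1, 1, 1, 1, 1, -1)),
  Sec2 = data.frame(date = dates, direction = c(1, -1, 1, -1, -1, 1, 1, -1, 1, -1))
)

# Hypothetical helper: summarise the current streak of one data frame
streak_summary <- function(df) {
  runs    <- rle(df$direction)
  n       <- length(runs$lengths)
  current <- runs$lengths[n]            # length of the run ending on the last date
  run_end <- cumsum(runs$lengths)       # row index at which each run ends
  longer  <- which(runs$lengths[-n] > current)  # earlier runs strictly longer

  if (length(longer) == 0) {
    # No earlier run beats the current one
    return(data.frame(current_streak     = current,
                      prev_longer_streak = NA_integer_,
                      prev_longer_ended  = as.Date(NA)))
  }

  last <- max(longer)                   # most recent of the longer runs
  data.frame(current_streak     = current,
             prev_longer_streak = runs$lengths[last],
             prev_longer_ended  = df$date[run_end[last]])
}

# One row per data frame, with the list name as the first column
summary_list <- lapply(clcn, streak_summary)
result <- cbind(security = names(summary_list), do.call(rbind, summary_list))
rownames(result) <- NULL
result
# Sec1: current streak 1, previous longer streak 4, ended 2019-01-09
# Sec2: current streak 1, previous longer streak 2, ended 2019-01-07
```

If no earlier streak is longer than the current one, the last two columns come back as NA rather than raising an error.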