I want to add a new column to the data frame below that calculates maximum dry spell length for each month. This is what my data frame looks like:
day month year rr spell spell1
1 1 1981 0 dry 1
2 1 1981 0 dry 1
3 1 1981 0 dry 1
4 1 1981 1.1 dry 0
5 1 1981 0 dry 1
6 1 1981 0 dry 1
7 1 1981 0 dry 1
8 1 1981 0 dry 1
9 1 1981 2.7 dry 0
10 1 1981 0 dry 1
This is the output I need:
month year spell_length
1 1981 3
1 1981 4
1 1981 1
This is what I have done so far:
group_by(df, year, month, spell1) %>%
summarise(spell_length = sum(spell1, na.rm = TRUE))
And this is the result:
year month spell1 spell_length
<int> <int> <dbl> <dbl>
1 1981 1 1 31
2 1981 2 0 0
3 1981 2 1 27
4 1981 3 0 0
5 1981 3 1 25
6 1981 4 0 0
Data:
df <- read.table(header = TRUE, text = "day month year rr spell spell1
1 1 1981 0 dry 1
2 1 1981 0 dry 1
3 1 1981 0 dry 1
4 1 1981 1.1 dry 0
5 1 1981 0 dry 1
6 1 1981 0 dry 1
7 1 1981 0 dry 1
8 1 1981 0 dry 1
9 1 1981 2.7 dry 0
10 1 1981 0 dry 1")
As its name implies, summarise() (also spelled summarize()) reduces a data frame to a summary with one row per group. These summaries are usually calculated after first grouping the observations by a factor or other categorical variable.
The function n() returns the number of observations in the current group.
count() lets you quickly count the unique values of one or more variables: df %>% count(a, b) is roughly equivalent to df %>% group_by(a, b) %>% summarise(n = n()). count() is paired with tally(), a lower-level helper that is equivalent to df %>% summarise(n = n()).
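As a rough illustration of that equivalence, here is a minimal sketch on the sample data from the question. The names by_count and by_summarise are only illustrative. It also shows why grouping by 'spell1' alone is not enough: it counts all dry days in the month, not the length of each consecutive run.

```r
library(dplyr)

# Sample data from the question
df <- read.table(header = TRUE, text = "day month year rr spell spell1
1 1 1981 0 dry 1
2 1 1981 0 dry 1
3 1 1981 0 dry 1
4 1 1981 1.1 dry 0
5 1 1981 0 dry 1
6 1 1981 0 dry 1
7 1 1981 0 dry 1
8 1 1981 0 dry 1
9 1 1981 2.7 dry 0
10 1 1981 0 dry 1")

# count() shortcut: one row per combination of year, month and spell1
by_count <- df %>% count(year, month, spell1)

# The longer form that count() stands in for
by_summarise <- df %>%
  group_by(year, month, spell1) %>%
  summarise(n = n(), .groups = "drop")

by_count
by_summarise
```

Both results report 2 rows with spell1 equal to 0 and 8 rows with spell1 equal to 1, which are monthly totals and not spell lengths.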
One option is to group by the run-length id of 'spell1' (rleid from data.table creates a new group id each time the value in that column changes), filter out the rows where 'spell1' is 0, and get the number of rows in each group with n():
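library(dplyr)
library(data.table)
df %>%
  # rleid() starts a new group id each time spell1 changes value
  group_by(year, month, grp = rleid(spell1)) %>%
  # keep only the dry runs
  filter(spell1 == 1) %>%
  # the number of rows in each run is the spell length
  summarise(spell_length = n()) %>%
  ungroup %>%
  select(-grp)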
# A tibble: 3 x 3
# year month spell_length
# <int> <int> <int>
#1 1981 1 3
#2 1981 1 4
#3 1981 1 1
Or use rle from base R:
rl1 <- rle(df$spell1)
rl1$lengths[rl1$values > 0]
#[1] 3 4 1
NOTE: This solution also works when the 'spell1' values are different
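If only the longest dry spell per month is wanted, as the question's first sentence suggests, the rle idea can be applied within each month. This is a minimal base R sketch, not part of the original answer. The helper longest_dry_run and the column name max_spell_length are hypothetical, and it assumes the rows are already sorted by day within each month.

```r
# Sample data from the question
df <- read.table(header = TRUE, text = "day month year rr spell spell1
1 1 1981 0 dry 1
2 1 1981 0 dry 1
3 1 1981 0 dry 1
4 1 1981 1.1 dry 0
5 1 1981 0 dry 1
6 1 1981 0 dry 1
7 1 1981 0 dry 1
8 1 1981 0 dry 1
9 1 1981 2.7 dry 0
10 1 1981 0 dry 1")

# Hypothetical helper: length of the longest run of 1s in x,
# or 0 when x contains no dry days at all
longest_dry_run <- function(x) {
  r <- rle(x)
  dry <- r$lengths[r$values == 1]
  if (length(dry) == 0) 0L else max(dry)
}

# Apply the helper to spell1 separately for every year/month combination
max_spell <- aggregate(spell1 ~ year + month, data = df, FUN = longest_dry_run)
names(max_spell)[3] <- "max_spell_length"
max_spell
```

For the sample data this gives a single row for January 1981 with max_spell_length equal to 4.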
Using dplyr, we can start a new group at every occurrence of 0 with cumsum and then sum 'spell1' within each group to get the spell length.
library(dplyr)
df %>%
group_by(month, year, group = cumsum(spell1 == 0)) %>%
summarise(spell_length = sum(spell1)) %>%
ungroup() %>%
select(-group)
# month year spell_length
# <int> <int> <int>
#1 1 1981 3
#2 1 1981 4
#3 1 1981 1
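The same cumsum grouping can be taken one step further to get a single maximum per month. This is a sketch under a few assumptions: 'spell1' contains only 0 and 1, the rows are sorted by date, and a spell that crosses a month boundary is meant to be split at the boundary. The name max_by_month is only illustrative. Consecutive wet days produce groups of length 0, which do not affect the maximum.

```r
library(dplyr)

# Sample data from the question
df <- read.table(header = TRUE, text = "day month year rr spell spell1
1 1 1981 0 dry 1
2 1 1981 0 dry 1
3 1 1981 0 dry 1
4 1 1981 1.1 dry 0
5 1 1981 0 dry 1
6 1 1981 0 dry 1
7 1 1981 0 dry 1
8 1 1981 0 dry 1
9 1 1981 2.7 dry 0
10 1 1981 0 dry 1")

max_by_month <- df %>%
  # a new group starts at every wet day (spell1 == 0)
  group_by(year, month, grp = cumsum(spell1 == 0)) %>%
  # length of each dry run
  summarise(spell_length = sum(spell1), .groups = "drop") %>%
  # longest run within each month
  group_by(year, month) %>%
  summarise(max_spell_length = max(spell_length), .groups = "drop")

max_by_month
```

For the sample data, max_by_month has one row (1981, month 1) with max_spell_length equal to 4.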