I am working on incorporating a variable that is recorded once per unit into a yearly dataset. While it is quite straightforward to repeat the observations n times, I have trouble assigning years to the repeated observations.
The structure of my data is as follows:
id startyear endyear dummy
1 1946 2005 1
2 1957 2005 1
3 1982 2005 1
4 1973 2005 1
What I want to do is create a new column, called year, that repeats each unit once for every year it covers (unit 1 gets 2005 - 1946 + 1 = 60 rows, unit 2 gets 2005 - 1957 + 1 = 49 rows, and so forth) and assigns the corresponding year to each row, generating the following output:
id startyear endyear dummy year
1 1946 2005 1 1946
1 1946 2005 1 1947
1 1946 2005 1 1948
1 1946 2005 1 1949
[…]
I have attempted to use slice and mutate in dplyr, in combination with rep and seq, but none of these gives me the result I want. Any help would be greatly appreciated.
We can use map2 to build, for each row, the sequence from 'startyear' to 'endyear', store those sequences in a list column, and then unnest it:
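library(tidyverse)
df1 %>%
  # one list element per row holding startyear:endyear
  mutate(year = map2(startyear, endyear, `:`)) %>%
  # expand the list column into one row per year
  unnest(year)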
#   id startyear endyear dummy year
# 1  1      1946    2005     1 1946
# 2  1      1946    2005     1 1947
# 3  1      1946    2005     1 1948
# 4  1      1946    2005     1 1949
# 5  1      1946    2005     1 1950
# 6  1      1946    2005     1 1951
# 7  1      1946    2005     1 1952
# ...
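For anyone who wants to run this end to end, here is a minimal sketch. The construction of df1 is an assumption: it simply retypes the four sample rows from the question, and the name `long` is made up for illustration. It also shows how many rows each unit should end up with.

```r
library(dplyr)
library(purrr)
library(tidyr)

# Assumption: df1 retyped from the sample rows shown in the question
df1 <- tibble(
  id        = 1:4,
  startyear = c(1946L, 1957L, 1982L, 1973L),
  endyear   = 2005L,
  dummy     = 1L
)

# Build startyear:endyear per row as a list column, then expand it
long <- df1 %>%
  mutate(year = map2(startyear, endyear, `:`)) %>%
  unnest(year)

# Each unit contributes endyear - startyear + 1 rows
count(long, id)
```

Because both endpoints are included, unit 1 yields 60 rows (1946 through 2005), not 59.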
Alternatively, group by 'id', create the list column with mutate, and unnest it:
df1 %>%
  group_by(id) %>%
  # with one row per id, startyear:endyear is a single sequence
  mutate(year = list(startyear:endyear)) %>%
  unnest(year)
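The grouped version relies on each 'id' occurring on exactly one row, as it does in the sample data. If that might not hold, a row-wise variant is one possible alternative. The sketch below is not part of the original answer: it assumes dplyr 1.0.0 or later for rowwise() with list columns, it retypes df1 from the question, and `by_id` and `by_row` are illustrative names.

```r
library(dplyr)
library(tidyr)

# Assumption: df1 retyped from the sample rows shown in the question
df1 <- tibble(
  id        = 1:4,
  startyear = c(1946L, 1957L, 1982L, 1973L),
  endyear   = 2005L,
  dummy     = 1L
)

# Grouped variant: valid here because each id appears on exactly one row
by_id <- df1 %>%
  group_by(id) %>%
  mutate(year = list(startyear:endyear)) %>%
  ungroup() %>%
  unnest(year)

# Row-wise variant: builds the sequence per row, so id need not be unique
by_row <- df1 %>%
  rowwise() %>%
  mutate(year = list(startyear:endyear)) %>%
  ungroup() %>%
  unnest(year)

# Both variants should produce the same number of unit-year rows
nrow(by_id) == nrow(by_row)
```

Calling ungroup() before unnest() keeps the result an ordinary tibble, so later steps are not silently grouped.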