Suppose I have dataframe like this:
family relationship meanings edu
1 1 A respondent 12
2 1 B respondent's spouse 18
3 1 C A's father 10
4 1 D A's mother 9
5 1 E1 A's first son 15
6 1 F1 E1's spouse 14
7 1 G11 E1's first son 3
8 1 G12 E1's second son 1
9 1 E2 A's second son 13
10 2 A respondent 21
11 2 B respondent's spouse 16
12 2 C A's father 12
13 2 D A's mother 16
14 2 E1 A's first son 18
15 2 F1 E1's spouse 15
16 2 E2 A's second son 17
17 2 E3 A's third son 16
family
indicates family number. relationship
indicates relationships in one family. meanings
indicates the meanings of the second column, relationship
.
I want to calculate the father generation’s maximum education year in one family. We do not need spouse's information.
The expected results are as follows:
family id edu fedu
1 1 A 12 10
2 1 C 10 NA
3 1 E1 15 18
4 1 E2 13 18
5 1 G11 3 15
6 1 G12 1 15
7 2 A 21 16
8 2 C 12 NA
9 2 E1 18 21
10 2 E2 17 21
11 2 E3 16 21
Here is the data:
d = structure(list(family = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2), relationship = c("A", "B", "C", "D", "E1", "F1", "G11", "G12", "E2", "A", "B", "C", "D", "E1", "F1", "E2", "E3"), meanings = c("respondent", "respondent's spouse", "A's father","A's mother", "A's first son", "E1's spouse", "E1's first son","E1's second son", "A's second son", "respondent", "respondent's spouse","A's father", "A's mother", "A's first son", "E1's spouse", "A's second son","A's third son"), edu = c(12, 18, 10, 9, 15, 14, 3, 1, 13, 21,16, 12, 16, 18, 15, 17, 16)), row.names = c(NA, -17L), class = c("tbl_df", "tbl", "data.frame"))
Here is what I tried. I thought it was necessary to create a generation variable. Seeing the sample image in your question, C
and D
are the 1st generation. A
and B
are the 2nd generation. E
and F
are the 3rd generation, and G
is the 4th generation. The first mutate()
with case_when()
created the generation variable. Then, I defined groups by family
and generation
. For each group, I identified max education duration (i.e., max_ed_duration
). Since you said that you do not need spouse's information, I dropped rows that contain mother or spouse in meanings
. Then, I defined groups again using family
. For each family, if generation
is 1, assign NA to fedu
. Otherwise, assign max_ed_duration
value from the previous generation to fedu
. Finally, I arranged the data by family
and relationship
.
library(dplyr)
mutate(mydf, generation = case_when(relationship %in% c("C", "D") ~ 1,
relationship %in% c("A", "B") ~ 2,
grepl(x = relationship, pattern = "^E|F") ~ 3,
grepl(x = relationship, pattern = "^G") ~ 4)) %>%
group_by(family, generation) %>%
mutate(max_ed_duration = max(edu)) %>%
filter(!grepl(x = meanings, pattern = "mother|spouse")) %>%
group_by(family) %>%
mutate(fedu = if_else(generation == 1,
NA_real_,
max_ed_duration[match(x = generation - 1, table = generation)])) %>%
arrange(family, relationship)
# family relationship meanings edu generation max_ed_duration fedu
# <dbl> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
# 1 1 A respondent 12 2 18 10
# 2 1 C A's father 10 1 10 NA
# 3 1 E1 A's first son 15 3 15 18
# 4 1 E2 A's second son 13 3 15 18
# 5 1 G11 E1's first son 3 4 3 15
# 6 1 G12 E1's second son 1 4 3 15
# 7 2 A respondent 21 2 21 16
# 8 2 C A's father 12 1 16 NA
# 9 2 E1 A's first son 18 3 18 21
#10 2 E2 A's second son 17 3 18 21
#11 2 E3 A's third son 16 3 18 21
DATA
mydf <- structure(list(family = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2,
2, 2, 2, 2, 2), relationship = c("A", "B", "C", "D", "E1", "F1",
"G11", "G12", "E2", "A", "B", "C", "D", "E1", "F1", "E2", "E3"
), meanings = c("respondent", "respondent's spouse", "A's father",
"A's mother", "A's first son", "E1's spouse", "E1's first son",
"E1's second son", "A's second son", "respondent", "respondent's spouse",
"A's father", "A's mother", "A's first son", "E1's spouse", "A's second son",
"A's third son"), edu = c(12, 18, 10, 9, 15, 14, 3, 1, 13, 21,
16, 12, 16, 18, 15, 17, 16)), class = "data.frame", row.names = c(NA,
-17L))
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With