I have a dataset with ID numbers for respondents' friends and bullies.
I'd like to go through all the friendship nominations and all the bully nominations in each row and get a count of the number of people they nominate as both. Any help would be great!
HAVE DATA:
ID friend_1 friend_2 friend_3 bully_1 bully_2
1 4 12 7 12 15
2 8 6 7 18 20
3 9 18 1 2 1
4 15 7 2 7 13
5 1 17 9 17 1
6 9 19 20 14 12
7 19 12 20 9 12
8 7 1 16 2 15
9 1 10 12 1 7
10 7 11 9 11 7
WANT DATA:
ID friend_1 friend_2 friend_3 bully_1 bully_2 num_both
1 4 12 7 12 15 1
2 8 6 7 18 20 0
3 9 18 1 2 1 1
4 15 7 2 7 13 1
5 1 17 9 17 1 2
6 9 19 20 14 12 0
7 19 12 20 9 12 1
8 7 1 16 2 15 0
9 1 10 12 1 7 1
10 7 11 9 11 7 2
Assuming that values are unique within friend/bully groups, a simple approach would be:
apply(df[,-1], 1, function (x) sum(table(x) > 1))
[1] 1 0 1 1 2 0 1 0 1 2
We can use apply row-wise and find out the the number of common friends which are present both in friend and bully columns
df$num_both <- apply(df, 1, function(x)
length(intersect(x[grep("friend", names(df))], x[grep("bully", names(df))])))
# ID friend_1 friend_2 friend_3 bully_1 bully_2 num_both
#1 1 4 12 7 12 15 1
#2 2 8 6 7 18 20 0
#3 3 9 18 1 2 1 1
#4 4 15 7 2 7 13 1
#5 5 1 17 9 17 1 2
#6 6 9 19 20 14 12 0
#7 7 19 12 20 9 12 1
#8 8 7 1 16 2 15 0
#9 9 1 10 12 1 7 1
#10 10 7 11 9 11 7 2
Or if you are not a big fan of apply, you can use sapply with the same logic
friend_cols <- grep("friend", names(df))
bully_cols <- grep("bully", names(df))
sapply(seq_len(nrow(df)), function(i)
length(intersect(df[i, friend_cols, drop = TRUE], df[i, bully_cols, drop = TRUE])))
#[1] 1 0 1 1 2 0 1 0 1 2
EDIT
If there are some NA values and we want to exclude them we can use is.na and sum
apply(df, 1, function(x) sum(!is.na(intersect(x[friend_cols], x[bully_cols]))))
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With