This is my first question on stackoverflow. I searched for similar questions but I didn't find an answer.
I know that the question in the title isn't clear but I hope you are going to understand what I want as output.
I have a dataframe that looks like this:
ID Name Year
1 1 Anas 2018
2 1 Carl 2018
3 1 Catherine 2018
4 2 Anas 2018
5 2 Carl 2018
6 3 Catherine 2018
7 3 Julien 2018
8 4 Raul 2018
9 4 Ahmed 2018
10 4 Laurence 2018
11 4 Carl 2018
12 5 Anas 2019
13 5 Georges 2019
14 5 Arman 2019
15 6 Anas 2019
16 6 Pietro 2019
17 7 Pietro 2019
18 8 Diego 2019
if the names in the column "Name" have the same ID, it means that are collaborators in a project.
I want to add a column with the number of UNIQUE collaborators per year for each name (by including each name in the count of his collborators)
The output should look like this: (I added the last column to explain how to count-Not needed)
ID Name Year Unique_Coll explication
1 1 Anas 2018 3 (Anas, Carl, Catherine)
2 1 Carl 2018 6 (Carle, Anas, Catherine, Laurence, Ahmed, Raul)
3 1 Catherine 2018 4 (Catherine, Carl, Anas, Julien)
4 2 Anas 2018 3 (Anas, Carl, Catherine)
5 2 Carl 2018 6 (Carle, Anas, Catherine, Laurence, Ahmed, Raul)
6 3 Catherine 2018 4 (Catherine, Carl, Anas, Julien)
7 3 Julien 2018 2 (Julien, Catherine)
8 4 Raul 2018 4 (Raul, Ahmed, Laurence, Carl)
9 4 Ahmed 2018 4 (Ahmed, Raul, Laurence, Carl)
10 4 Laurence 2018 4 (Laurence, Raul, Ahmed, Carl)
11 4 Carl 2018 6 (Carle, Anas, Catherine, Laurence, Ahmed, Raul)
12 5 Anas 2019 4 (Anas, Georges, Arman, Pietro)
13 5 Georges 2019 3 (Georges, Anas, Arman)
14 5 Arman 2019 3 (Arman Anas, Georges)
15 6 Anas 2019 4 (Anas, Georges, Arman, Pietro)
16 6 Pietro 2019 2 (Pietro, Anas)
17 7 Pietro 2019 2 (Pietro, Anas)
18 8 Diego 2019 1 (Diego)
Thank you
You could construct a variable that would be a list of names and count the number of unique names in the following way:
library(dplyr)
df = df %>%
group_by(ID) %>%
mutate(group = list(Name)) %>%
group_by(Year,Name) %>%
mutate(n = n_distinct(unlist(list(group)))) %>%
select(-group)
# A tibble: 18 x 4
# Groups: Year, Name [12]
ID Name Year n
<int> <chr> <int> <int>
1 1 Anas 2018 3
2 1 Carl 2018 6
3 1 Catherine 2018 4
4 2 Anas 2018 3
5 2 Carl 2018 6
6 3 Catherine 2018 4
7 3 Julien 2018 2
8 4 Raul 2018 4
9 4 Ahmed 2018 4
10 4 Laurence 2018 4
11 4 Carl 2018 6
12 5 Anas 2019 4
13 5 Georges 2019 3
14 5 Arman 2019 3
15 6 Anas 2019 4
16 6 Pietro 2019 2
17 7 Pietro 2019 2
18 8 Diego 2019 1
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With