
How to calculate the number of unique players (when repeat entry is allowed)?

Tags: r

I am trying to calculate the number of unique players in an experiment where each player is allowed to re-enter the game. Here is what the data look like:

x <- read.table(header = TRUE, text = "group timepast Name NoOfUniquePlayer
1 0.02703 A 1
1 0.02827 B 2
1 0.02874 A 2
1 0.02875 A 2
1 0.02875 D 3
2 0.03255 M 1
2 0.03417 K 2
2 0.10029 T 3
2 0.10394 T 3
2 0.10605 K 3
2 0.16522 T 3
3 0.11938 E 1
3 0.12607 F 2
3 0.13858 E 2
3 0.16084 G 3
3 0.19830 G 3
3 0.24563 V 4")

The original experiment data contain the first three columns. The first is the group number of each experiment (three groups here). The second is the normalized time at which each player joined the experiment (I have sorted this column from smallest to largest). The third is the name of each player (each player joins only a single group).

What I want to generate is the last column, NoOfUniquePlayer, the number of unique players seen so far within each group. For example, in group 1 five entries (A B A A D) are recorded, but there are only 3 unique players (A B D): player A started the game (1st row) and re-joined (3rd row) after player B played (2nd row); player A then joined again (recorded in the 4th row); finally player D entered and finished the whole game.

Can anyone help me figure out how to do this in R?

user001 asked Jan 22 '26 21:01


2 Answers

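I think this will give you what you want. (I think there is an error in your expected column for group 2: in the output below, rows 10 and 11 list 4 in NoOfUniquePlayer, but only three distinct players, M, K and T, have appeared by then.)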

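# Assumes the rows of x are already ordered by group, because
# unlist(tapply(...)) returns the results one group after another.
x$uniquenum <- unlist(
  tapply(
    x$Name,
    x$group,
    # Number each name by order of first appearance, then take the running maximum
    function(y) cummax(as.numeric(factor(y, levels = y[!duplicated(y)])))
  )
)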

   group timepast Name NoOfUniquePlayer uniquenum
1      1  0.02703    A                1         1
2      1  0.02827    B                2         2
3      1  0.02874    A                2         2
4      1  0.02875    A                2         2
5      1  0.02875    D                3         3
6      2  0.03255    M                1         1
7      2  0.03417    K                2         2
8      2  0.10029    T                3         3
9      2  0.10394    T                3         3
10     2  0.10605    K                4         3
11     2  0.16522    T                4         3
12     3  0.11938    E                1         1
13     3  0.12607    F                2         2
14     3  0.13858    E                2         2
15     3  0.16084    G                3         3
16     3  0.19830    G                3         3
17     3  0.24563    V                4         4
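
To see why this works, it may help to step through a single group by hand. The sketch below uses the five names from group 1; the variable names `y`, `first_seen` and `idx` are only for illustration and do not appear in the code above.

```r
# Names in group 1, in order of entry
y <- c("A", "B", "A", "A", "D")

# Distinct names in order of first appearance: "A" "B" "D"
first_seen <- y[!duplicated(y)]

# Position of each entry's name within first_seen: 1 2 1 1 3
idx <- as.numeric(factor(y, levels = first_seen))

# The running maximum is the number of distinct players seen so far: 1 2 2 2 3
cummax(idx)
```

A repeat entry always maps to an index that has already been reached, so the running maximum only increases when a new name shows up.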
thelatemail answered Jan 24 '26 15:01


Slightly more compactly, using data.table:

library(data.table)

DT <- data.table(x)

# Running count of distinct names within each group, added by reference
DT[, uniqueNum := cummax(match(Name, unique(Name))), by = group]

If you want the total number of unique players in each group, then:

DT[, totalUnique := max(uniqueNum), by = group] 
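
If data.table is not available, roughly the same per-group calculation can be sketched in base R with `ave()`. This is only an illustration under stated assumptions: the small data frame `d` is made-up toy data modelled on the question, and the sketch counts first appearances with `cumsum(!duplicated(...))` instead of `cummax(match(...))`, which should give the same running count.

```r
# Hypothetical toy data modelled on groups 1 and 2 of the question
d <- data.frame(
  group = c(1, 1, 1, 2, 2, 2, 2),
  Name  = c("A", "B", "A", "M", "K", "T", "K"),
  stringsAsFactors = FALSE
)

# Within each group, add 1 whenever a name appears for the first time.
# ave() is applied to row indices so the result stays numeric.
d$uniqueNum <- ave(
  seq_len(nrow(d)), d$group,
  FUN = function(i) cumsum(!duplicated(d$Name[i]))
)

# Total number of unique players per group, repeated on every row
d$totalUnique <- ave(d$uniqueNum, d$group, FUN = max)

d$uniqueNum
# 1 2 2 1 2 3 3
```

Unlike `unlist(tapply(...))`, `ave()` returns values in the original row order, so this version does not depend on the rows being sorted by group.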
mnel answered Jan 24 '26 13:01


