For this example, I'll use the data.table package.
Suppose you have a table of coaches:
library(data.table)
coaches <- data.table(CoachID=c(1,2,3), CoachName=c("Bob","Sue","John"), NumPlayers=c(2,3,0))
coaches
   CoachID CoachName NumPlayers
1:       1       Bob          2
2:       2       Sue          3
3:       3      John          0
and a table of players
players <- data.table(PlayerID=c(1,2,3,4,5,6), PlayerName=c("Abe","Bart","Chad","Dalton","Egor","Frank"))
players
   PlayerID PlayerName
1:        1        Abe
2:        2       Bart
3:        3       Chad
4:        4     Dalton
5:        5       Egor
6:        6      Frank
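You want to match each coach with a randomly chosen set of players such that each coach gets the number of players given by NumPlayers and no player is assigned to more than one coach.
How do you do this? One acceptable result would be: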
exampleResult <- data.table(CoachID=c(1,1,2,2,2,3), PlayerID=c(3,1,2,5,6,NA))
exampleResult
   CoachID PlayerID
1:       1        3
2:       1        1
3:       2        2
4:       2        5
5:       2        6
6:       3       NA
You could sample without replacement from the player IDs, grabbing the total number of players you need:
set.seed(144)
(selections <- sample(players$PlayerID, sum(coaches$NumPlayers)))
# [1] 1 4 3 2 6
Each player has an equal probability of being included in selections, and the ordering of that vector is random. Therefore you can just assign these players to each coaching slot:
data.frame(CoachID=rep(coaches$CoachID, coaches$NumPlayers),
PlayerID=selections)
#   CoachID PlayerID
# 1       1        1
# 2       1        4
# 3       2        3
# 4       2        2
# 5       2        6
If you wanted to have an NA value for any coaches with no player selections, you could do something like:
rbind(data.frame(CoachID=rep(coaches$CoachID, coaches$NumPlayers),
PlayerID=selections),
data.frame(CoachID=coaches$CoachID[coaches$NumPlayers==0],
PlayerID=rep(NA, sum(coaches$NumPlayers==0))))
#   CoachID PlayerID
# 1       1        1
# 2       1        4
# 3       2        3
# 4       2        2
# 5       2        6
# 6       3       NA
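The two steps above can be wrapped in one helper. This is only a sketch: the name assign_players is made up for illustration, the example tables are rebuilt as plain data frames so the block runs on its own, and it assumes there are at least as many players as requested slots (sampling without replacement fails otherwise, so the sketch stops early with its own message).

```r
# Example tables from the question, rebuilt as plain data frames
coaches <- data.frame(CoachID = c(1, 2, 3),
                      CoachName = c("Bob", "Sue", "John"),
                      NumPlayers = c(2, 3, 0))
players <- data.frame(PlayerID = 1:6,
                      PlayerName = c("Abe", "Bart", "Chad", "Dalton", "Egor", "Frank"))

# Hypothetical helper, not part of the original answer
assign_players <- function(coaches, players) {
  slots <- rep(coaches$CoachID, coaches$NumPlayers)  # one entry per open slot
  if (length(slots) > nrow(players))
    stop("more slots than players")
  # Sample row positions without replacement, then look up the IDs
  picked <- players$PlayerID[sample.int(nrow(players), length(slots))]
  idle <- coaches$CoachID[coaches$NumPlayers == 0]   # coaches with no slots
  rbind(data.frame(CoachID = slots, PlayerID = picked),
        data.frame(CoachID = idle, PlayerID = rep(NA, length(idle))))
}

set.seed(144)
result <- assign_players(coaches, players)
result
```

The exact players drawn depend on the seed and on the R version's sampling algorithm, so only the shape of the result (two rows for coach 1, three for coach 2, one NA row for coach 3) should be relied on.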
Get demand and supply on each side, so to speak:
demand <- with(coaches,rep(CoachID,NumPlayers))
supply <- players$PlayerID
Then I'd do...
randmatch <- function(demand,supply){
  n_demand <- length(demand)
  n_supply <- length(supply)
  n_matches <- min(n_demand,n_supply)
  # randomly thin out the longer side so both sides have n_matches entries
  if (n_demand >= n_supply)
    data.frame(d=sample(demand,n_matches),s=supply)
  else
    data.frame(d=demand,s=sample(supply,n_matches))
}
Examples:
set.seed(1)
randmatch(demand,supply) # some players unmatched, OP's example
randmatch(rep(1:3,1:3),1:4) # some coaches unmatched
I'm not sure if this is a case the OP wanted to cover, though.
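One caveat, documented in ?sample: when the first argument is a single number n >= 1, sample() draws from 1:n rather than from that one value. In randmatch this can bite when demand has length one, for example a single slot for coach 3 and a single player. A hedged variant (the name randmatch_safe is hypothetical, not from the answer) samples positions with sample.int() and indexes into the vectors instead:

```r
# Hypothetical variant of randmatch that never passes the ID vectors to sample()
randmatch_safe <- function(demand, supply) {
  n_demand <- length(demand)
  n_supply <- length(supply)
  n_matches <- min(n_demand, n_supply)
  if (n_demand >= n_supply)
    data.frame(d = demand[sample.int(n_demand, n_matches)], s = supply)
  else
    data.frame(d = demand, s = supply[sample.int(n_supply, n_matches)])
}

# One slot for coach 3, one player with ID 7:
# sample(3, 1) could return 1, 2 or 3, whereas indexing always keeps coach 3
single <- randmatch_safe(demand = 3, supply = 7)

# The example from the question still works the same way
set.seed(1)
m_safe <- randmatch_safe(rep(c(1, 2, 3), c(2, 3, 0)), 1:6)
```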
For the OP's desired output...
m <- randmatch(demand,supply)
merge(m,coaches,by.x="d",by.y="CoachID",all=TRUE)
#   d  s CoachName NumPlayers
# 1 1  2       Bob          2
# 2 1  6       Bob          2
# 3 2  3       Sue          3
# 4 2  4       Sue          3
# 5 2  1       Sue          3
# 6 3 NA      John          0
Similarly...
merge(m,players,by.x="s",by.y="PlayerID",all=TRUE)
#   s  d PlayerName
# 1 1  2        Abe
# 2 2  1       Bart
# 3 3  2       Chad
# 4 4  2     Dalton
# 5 5 NA       Egor
# 6 6  1      Frank
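To convince yourself that a matching satisfies the constraints, a small check along these lines may help. It is a sketch only: is_valid_match is a made-up name, and the matching m is typed in by hand from the merged output above rather than regenerated, since the random draw depends on the R version.

```r
coaches <- data.frame(CoachID = c(1, 2, 3), NumPlayers = c(2, 3, 0))
players <- data.frame(PlayerID = 1:6)
# Matching copied from the merged output shown above (d = coach, s = player)
m <- data.frame(d = c(1, 1, 2, 2, 2), s = c(2, 6, 3, 4, 1))

# Hypothetical checker: known IDs only, no player used twice,
# and no coach given more players than NumPlayers allows
is_valid_match <- function(m, coaches, players) {
  used <- as.vector(table(factor(m$d, levels = coaches$CoachID)))
  all(m$d %in% coaches$CoachID) &&
    all(m$s %in% players$PlayerID) &&
    !anyDuplicated(m$s) &&
    all(used <= coaches$NumPlayers)
}

ok <- is_valid_match(m, coaches, players)

# Assigning player 2 twice should be rejected
bad <- is_valid_match(data.frame(d = c(1, 2), s = c(2, 2)), coaches, players)
```

Here ok comes out TRUE and bad comes out FALSE.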