I am writing an algorithm to match students with different groups. Each group has a limited number of spots. Each student provides their top 5 choices of groups. The students are then placed into groups in a predetermined order (older students and students with perfect attendance are given higher priority). There is no requirement for groups to be filled entirely, but they cannot be filled passed capacity.
I've looked into similar marriage problems such as the Gale-Shapely stable marriage algorithm, but the problem I am having is that there far fewer groups than students and each group can accept multiple students.
What is the best way to implement such an algorithm to find a solution that has been optimized entirely such that there is no better arrangement of students in groups? In terms of algorithm complexity, I'm placing roughly 600 students into 10-20 groups.
NB The close votes are terribly misplaced. Algorithm choice and design to solve an ambiguous problem is absolutely part of programming.
I think you'll get farther with Minimum Weight Bipartite Matching than Stable Marriage (also called the Hungarian method or algorithm or Maximum Weight Matching, which can give you a min weight matching just by negating the weights).
You are out to match positions with students. So the two node types in the bipartite graph are these.
The simplest statement of the algorithm requires a complete weighed bipartite graph with equal numbers of nodes in each set. You can think of this as a square matrix. The weights are the elements. The rows are students. The columns are positions.
The algorithm will pick a single element from each row/column such that the sum is minimized.
@nava's proposal is basically a greedy version of MWBM that's not optimal. The true Hungarian algorithm will give you an optimal answer.
To handle the fact that you have fewer positions than students is easy. To the "real" positions add as many "dummy" positions as needed. Connect all these to all the students with super high-weight edges. The algorithm will only pick them after all the real positions are matched.
The trick is to pick the edge weights. Let's call the ordinal where a student would be considered for a position O_i for the i'th student. Then let R_ip be the ranking that the same student places on the p'th position. Finally, W_ip is the weight of the edge connecting the i'th student to the p'th position. You'll want something like:
W_ip = A * R_ip + B * O_i
You get to pick A and B to specify the relative importance of the students' preferences and the order they're ranked. It sounds like order is quite important. So in that case you want B to be big enough to completely override students' rankings.
A = 1, B = N^2, where N is the number of students.
Once you get an implementation working, it's actually fun to tweak the parameters to see how many students get what preference, etc. You might want to tweak the parameters a bit to give up a little on the order.
At the time I was working on this (late 90's), the only open source MWBM I could find was an ancient FORTRAN lib. It was O(N^3). It handled 1,000 students (selecting core academic program courses) in a few seconds. I spent a lot of time coding a fancy O(N^2 log N) version that turned out to be about 3x slower for N=1000. It only started "winning" at about 5,000.
These days there are probably better options.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With