
Divvying people into rooms by last name?

Tags: algorithm

I often teach large introductory programming classes (400 - 600 students) and when exam time comes around, we often have to split the class up into different rooms in order to make sure everyone has a seat for the exam.

To keep things logistically simple, I usually break the class apart by last name. For example, I might send students with last names A - H to one room, last names I - L to a second room, M - S to a third room, and T - Z to a fourth room.

The challenge in doing this is that the rooms often have wildly different capacities and it can be hard to find a way to segment the class in a way that causes everyone to fit. For example, suppose that the distribution of last names is (for simplicity) the following:

  • Last name starts with A: 25
  • Last name starts with B: 150
  • Last name starts with C: 200
  • Last name starts with D: 50

Suppose that I have rooms with capacities 350, 50, and 50. A greedy algorithm for finding a room assignment might be to sort the rooms into descending order of capacity, then try to fill in the rooms in that order. This, unfortunately, doesn't always work. For example, in this case, the right option is to put last name A in one room of size 50, last names B - C into the room of size 350, and last name D into another room of size 50. The greedy algorithm would put last names A and B into the 350-person room, then fail to find seats for everyone else.
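
To make the failure concrete, here is a minimal Python sketch of the greedy approach described above. The function name `greedy_assign` and the half-open `(start, end)` block representation are illustrative choices, not part of the original question.

```python
def greedy_assign(counts, capacities):
    """Fill rooms in the given order with as many consecutive letters as fit.

    Returns one half-open (start, end) letter block per room, or None if
    some letters are left without a room.
    """
    blocks = []
    i = 0  # index of the next unassigned letter
    for cap in capacities:
        start = i
        while i < len(counts) and counts[i] <= cap:
            cap -= counts[i]
            i += 1
        blocks.append((start, i))  # the block may be empty
    return blocks if i == len(counts) else None


counts = [25, 150, 200, 50]  # letters A, B, C, D
rooms = [350, 50, 50]

# Descending capacity order: A and B go into the 350 room, C fits nowhere.
print(greedy_assign(counts, sorted(rooms, reverse=True)))  # None
# A different room order succeeds.
print(greedy_assign(counts, [50, 350, 50]))  # [(0, 1), (1, 3), (3, 4)]
```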

It's easy to solve this problem by just trying all possible permutations of the room orderings and then running the greedy algorithm on each ordering. This will either find an assignment that works or report that none exists. However, I'm wondering if there is a more efficient way to do this, given that the number of rooms might be between 10 and 20 and checking all permutations might not be feasible.
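
The brute-force idea can be sketched roughly as follows in Python. The names `greedy_fill` and `brute_force` are made up for illustration, and the result maps each room index to a half-open block of letter indices.

```python
from itertools import permutations


def greedy_fill(counts, capacities):
    """Greedy pass for one fixed room order; returns (start, end) blocks or None."""
    i = 0
    blocks = []
    for cap in capacities:
        start = i
        while i < len(counts) and counts[i] <= cap:
            cap -= counts[i]
            i += 1
        blocks.append((start, i))
    return blocks if i == len(counts) else None


def brute_force(counts, capacities):
    """Try every room ordering; return {room_index: (start, end)} or None."""
    for order in permutations(range(len(capacities))):
        blocks = greedy_fill(counts, [capacities[r] for r in order])
        if blocks is not None:
            return dict(zip(order, blocks))
    return None


print(brute_force([25, 150, 200, 50], [350, 50, 50]))
```

With 20 rooms this would mean up to 20! (about 2.4 × 10^18) orderings, which is why the approach does not scale.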

To summarize, the formal problem statement is the following:

You are given a frequency histogram of the last names of the students in a class, along with a list of rooms and their capacities. Your goal is to divvy up the students by the first letter of their last name so that each room is assigned a contiguous block of letters and does not exceed its capacity.

Is there an efficient algorithm for this, or at least one that is efficient for reasonable room sizes?

EDIT: Many people have asked about the contiguous condition. The rules are

  • Each room should be assigned at most one block of contiguous letters, and
  • No letter should be assigned to two or more rooms.

For example, you could not put A - E, H - N, and P - Z into the same room. You could also not put A - C in one room and B - D in another.
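
One way to state these rules precisely is as a checker. This is only a sketch: it assumes an assignment is given as a list with one room index per letter (so the second rule holds by construction), and `is_valid` is a hypothetical helper name.

```python
def is_valid(assignment, counts, capacities):
    """assignment[i] is the room index chosen for letter i."""
    load = [0] * len(capacities)
    seen = set()
    prev = None
    for letter, room in enumerate(assignment):
        if room != prev:
            if room in seen:
                # The room shows up again after a gap, so its letters
                # do not form a single contiguous block.
                return False
            seen.add(room)
            prev = room
        load[room] += counts[letter]
    # Every room must stay within its capacity.
    return all(used <= cap for used, cap in zip(load, capacities))


counts = [25, 150, 200, 50]
rooms = [350, 50, 50]
print(is_valid([2, 0, 0, 1], counts, rooms))  # True
print(is_valid([1, 0, 1, 2], counts, rooms))  # False: room 1 is not contiguous
```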

Thanks!

templatetypedef asked May 04 '13 17:05



2 Answers

This can be solved with a DP over an [m, 2^n] state space, where m is the number of letters (26 for English) and n is the number of rooms. With m == 26 and n == 20 it takes about 100 MB of space and roughly 1 second of time. Below is a solution I have just implemented in C# (it should also compile as C++ or Java with a few minor changes):

int[] GetAssignments(int[] studentsPerLetter, int[] rooms)
{
    int numberOfRooms = rooms.Length;
    int numberOfLetters = studentsPerLetter.Length;
    int roomSets = 1 << numberOfRooms; // 2 ^ (number of rooms)
    int[,] map = new int[numberOfLetters + 1, roomSets];

    for (int i = 0; i <= numberOfLetters; i++)
        for (int j = 0; j < roomSets; j++)
            map[i, j] = -2;

    map[0, 0] = -1; // starting condition

    for (int i = 0; i < numberOfLetters; i++)
        for (int j = 0; j < roomSets; j++)
            if (map[i, j] > -2)
            {
                for (int k = 0; k < numberOfRooms; k++)
                    if ((j & (1 << k)) == 0)
                    {
                        // This room is still unused.
                        int roomCapacity = rooms[k];
                        int t = i;
                        for (; t < numberOfLetters && roomCapacity >= studentsPerLetter[t]; t++)
                            roomCapacity -= studentsPerLetter[t];

                        // marking next state as good, also specifying index of just occupied room
                        // - it will help to construct solution backwards.
                        map[t, j | (1 << k)] = k;
                    }
            }

    // Constructing solution.
    int[] res = new int[numberOfLetters];
    int lastIndex = numberOfLetters - 1;
    for (int j = 0; j < roomSets; j++)
    {
        int roomMask = j;
        while (map[lastIndex + 1, roomMask] > -1)
        {
            int lastRoom = map[lastIndex + 1, roomMask];
            int roomCapacity = rooms[lastRoom];
            for (; lastIndex >= 0 && roomCapacity >= studentsPerLetter[lastIndex]; lastIndex--)
            {
                res[lastIndex] = lastRoom;
                roomCapacity -= studentsPerLetter[lastIndex];
            }

            roomMask ^= 1 << lastRoom; // Remove last room from set.

            j = roomSets; // Terminate the outer loop.
        }
    }

    return lastIndex > -1 ? null : res;
}
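
As a rough check on the space estimate quoted before the code, the table has (m + 1) rows and 2^n columns. The short Python calculation below assumes 4-byte ints, as in C#.

```python
m, n = 26, 20

states = (m + 1) * (1 << n)  # table cells: (letters + 1) x 2^rooms
bytes_needed = states * 4    # assuming 4 bytes per int

print(states)                 # 28311552
print(bytes_needed / 2**20)   # 108.0 (MiB)
```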

Example from the OP's question:

int[] studentsPerLetter = { 25, 150, 200, 50 };
int[] rooms = { 350, 50, 50 };
int[] ans = GetAssignments(studentsPerLetter, rooms);

The answer will be:

2
0
0
1

Each value is the index of the room assigned to the corresponding last-name letter. If no assignment is possible, my solution returns null.
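
To read such a result more easily, a small Python sketch like the following could turn the per-letter room indices into letter ranges. The helper name `describe` is hypothetical, and the letters are assumed to be A, B, C, ... in order.

```python
from itertools import groupby
from string import ascii_uppercase


def describe(assignment, letters=ascii_uppercase):
    """Turn a per-letter room list into {room: 'X - Y'} labels."""
    out = {}
    pos = 0
    for room, run in groupby(assignment):
        size = len(list(run))
        first, last = letters[pos], letters[pos + size - 1]
        out[room] = first if first == last else f"{first} - {last}"
        pos += size
    return out


print(describe([2, 0, 0, 1]))  # {2: 'A', 0: 'B - C', 1: 'D'}
```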


[Edit]

After thousands of auto-generated tests, my friend found a bug in the code that constructs the solution backwards. It does not affect the main algorithm, so fixing this bug is left as an exercise for the reader.

The test case that reveals the bug is students = [13,75,21,49,3,12,27,7] and rooms = [6,82,89,6,56]. My solution returns no answer, but an answer actually exists. Please note that the first part of the solution works properly; only the answer-construction part fails.
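
A possible way to repair the reconstruction, sketched in Python rather than C#: record for every reachable state which letter index and room it was reached from, and follow those links back instead of re-deriving each block with a backward scan. This assumes the failure comes from the backward scan taking more letters than the forward pass gave the room. The names `get_assignments` and `parent` are illustrative, and the dictionary-based table is far slower than the array version for 20 rooms.

```python
def get_assignments(students, rooms):
    """Return one room index per letter, or None if no assignment exists."""
    m, n = len(students), len(rooms)
    # parent[(i, mask)] = (previous i, room just filled); (0, 0) is the start.
    parent = {(0, 0): None}
    for i in range(m):
        for mask in range(1 << n):
            if (i, mask) not in parent:
                continue
            for k in range(n):
                if mask & (1 << k):
                    continue  # room k is already used
                cap, t = rooms[k], i
                while t < m and students[t] <= cap:
                    cap -= students[t]
                    t += 1
                # Keep the first predecessor found for each state.
                parent.setdefault((t, mask | (1 << k)), (i, k))

    # Any reachable state that covers all m letters is a solution.
    state = next((s for s in parent if s[0] == m), None)
    if state is None:
        return None

    res = [None] * m
    while parent[state] is not None:
        prev_i, k = parent[state]
        for letter in range(prev_i, state[0]):
            res[letter] = k  # letters prev_i .. state[0] - 1 sit in room k
        state = (prev_i, state[1] ^ (1 << k))
    return res


print(get_assignments([13, 75, 21, 49, 3, 12, 27, 7], [6, 82, 89, 6, 56]))
```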

SergeyS answered Nov 13 '22 12:11



This problem is NP-complete, so there is no known polynomial-time (that is, efficient) solution for it (unless someone proves P = NP). You can reduce an instance of the knapsack or bin-packing problem to your problem to prove that it is NP-complete.

To solve this, you can use the 0-1 knapsack problem. Here is how: first pick the biggest classroom and try to allocate as many groups of students to it as you can (using 0-1 knapsack), i.e. up to the size of the room. You are guaranteed not to split a group of students, since this is 0-1 knapsack. Once done, take the next biggest classroom and continue.

(You can use any known heuristic to solve the 0-1 knapsack problem.)
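
A rough Python sketch of this heuristic follows, with each group's value equal to its weight so that the knapsack step fills a room as fully as possible. The names `best_subset` and `knapsack_heuristic` are made up for illustration. Note that, like the heuristic as described, the sketch picks groups freely and does not enforce the contiguous-block rule from the question, and it may fail even when a valid assignment exists.

```python
def best_subset(weights, capacity):
    """0-1 knapsack with value == weight.

    Returns indices of a subset whose total weight is as large as
    possible without exceeding capacity.
    """
    reach = {0: []}  # achievable total -> indices of one subset reaching it
    for idx, w in enumerate(weights):
        # Iterate over a snapshot so each item is used at most once.
        for total, chosen in list(reach.items()):
            if total + w <= capacity and total + w not in reach:
                reach[total + w] = chosen + [idx]
    return reach[max(reach)]


def knapsack_heuristic(counts, capacities):
    """Fill rooms from largest to smallest; {room: [letter indices]} or None."""
    remaining = list(range(len(counts)))
    result = {}
    by_size = sorted(range(len(capacities)), key=lambda r: -capacities[r])
    for room in by_size:
        picked = set(best_subset([counts[i] for i in remaining], capacities[room]))
        result[room] = [remaining[p] for p in sorted(picked)]
        remaining = [x for j, x in enumerate(remaining) if j not in picked]
    return result if not remaining else None


print(knapsack_heuristic([25, 150, 200, 50], [350, 50, 50]))
# {0: [1, 2], 1: [3], 2: [0]}
```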

Here is the reduction. You need to reduce a general instance of 0-1 knapsack to a specific instance of your problem. So let's take a general instance of 0-1 knapsack: a sack whose capacity is W, and groups x_1, x_2, ..., x_n whose corresponding weights are w_1, w_2, ..., w_n.

Now the reduction: this general instance is reduced to your problem as follows. You have one classroom with seating capacity W. Each x_i (for i = 1, ..., n) is the group of students whose last names begin with the i-th letter, and the size of that group is w_i.

Now you can prove that if there is a solution to the 0-1 knapsack problem, your problem has a solution, and conversely. Equivalently, if there is no solution to the 0-1 knapsack problem, then your problem has no solution, and vice versa.

Please remember the important point about reductions: you reduce a general instance of a known NP-complete problem to a specific instance of your problem.

Hope this helps :)

Bill answered Nov 13 '22 11:11
