Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Find the highest value in the Matrix to maximize the score

Question:

I would like to find the highest value in the Matrix per teacher and group to maximize the ratio of which group should go with which teacher.

            Teacher A   Teacher B   Teacher C   Teacher D
Group 1     50          40          20           50
Group 2     30          10          40          100
Group 3     80          60          40           20

In the above table. I know how to find out the highest value in rows and columns but I want to find highest in the combination of both Teacher and Group, whereby a Teacher cannot belong to two groups and the Groups cannot to two teachers. Yes, there can be more teachers available than the groups.

So I am looking for the final output as follows:

Solution

Group 1 with Teacher B: 40
Group 2 with Teacher D: 100
Group 3 with Teacher A: 80

My work so far I have tried a couple of ways to solve this using pandas but everything only fetches the highest value of rows and columns OR at best the name of a key that is the highest. I followed the tutorial here but didn't gain much success. Any guidance will be great.

like image 545
colourtheweb Avatar asked Feb 12 '20 09:02

colourtheweb


2 Answers

This looks like an optimization question.

You have 2 ways to approach it (on a theorical point of view).

  1. heuristic:

    Except on pathological use case, we can think that the highest value in the matrix will end in the final result. Here we have 100 for Group2 and Teacher D. We then remove the line for Group 2 and the column for Teacher D and iterate.

    This gives step by step:

    Group 2    Teacher D   100
    Group 3    Teacher A    80
    Group 1    Teacher B    50
    
  2. exhaustive

    The previous method will lead to the correct resul is values have large differences, but could only find a solution close to the maximum if values were too close to each other. The exhaustive method consist in computing the sum of values for every possible combination and keep the highest. It will of course give same result, but would require too much operations for me to show it by hand here...

Python translation

First method is iterative but simple:

# heuristic

dfA = df
result = {}

while (len(dfA) > 0):
    mx = dfA.max()     # find max per teacher
    mmx = pd.Series(mx[mx == mx.max()])  # find absolute max of matrix
    teacher = mmx.index[0]                       # get teacher
    val = mmx.values[0]                          # get value
    group = dfA[dfA[teacher] == val].index[0]    # get group
    result[group] = (teacher, val)               # store the triplet
    dfA = dfA.drop(index = group).drop(columns = teacher) # remove the row and column

dfout = pd.DataFrame(result).T
print(dfout.to_string())

Gives as expected:

                 0    1
Group 2  Teacher D  100
Group 3  Teacher A   80
Group 1  Teacher B   40

Second method is more deterministic but may not be scalable to large datasets:

import itertools

# compute with itertools all the possible permutations of group-teachers
mindex = pd.MultiIndex.from_tuples(itertools.permutations(df.columns, len(df)))

# compute the total value for each permutation
total = pd.DataFrame(data = 0, columns=mindex, index=df.index
                     ).transform(lambda x: pd.Series(
                         [df.loc[x.index[i], x.name[i]]
                          for i in range(len(x))], index=x.index)).sum()

# prepare the resulting dataframe
dfout = pd.DataFrame({'Groups': df.index,
                      'Teachers': total[total == total.max()].index[0]})

# extract the value per group
dfout['val'] = dfout.apply(lambda x: df.loc[x['Groups'], x['Teachers']], axis=1)

print(dfout.to_string())

It gives same value as expected

    Groups   Teachers  val
0  Group 1  Teacher B   40
1  Group 2  Teacher D  100
2  Group 3  Teacher A   80
like image 183
Serge Ballesta Avatar answered Oct 28 '22 19:10

Serge Ballesta


First search for all permutations possible, then take the max for the sum of values, and finally print it. Here is my implementation with dataframes :

import itertools
m = [
    [50, 40, 20, 50],
    [30, 10, 40, 100],
    [80, 60, 40, 20]
]
rows = ['Group 1', 'Group 2', 'Group 3']
cols = ['Teacher A', 'Teacher B', 'Teacher C', 'Teacher D']
df = pd.DataFrame(m, index=rows, columns=cols)

permuts = itertools.permutations(cols, len(rows))

L = []
for p in permuts:
    s = 0
    d = {}
    for i, r in enumerate(rows):
        s += df[p[i]][r]
        d[r] = p[i]
    obj = [s, d]
    L.append(obj)

result = max(L, key=lambda x: x[0])
# [220, {'Group 1': 'Teacher B', 'Group 2': 'Teacher D', 'Group 3': 'Teacher A'}]
# Here 220 is the maximum sum you can have

result_dict = result[1]
# {'Group 1': 'Teacher B', 'Group 2': 'Teacher D', 'Group 3': 'Teacher A'}

for i, v in result_dict.items():
    print("{} with {} : {}".format(i, v, df[v][i]))

# Group 1 with Teacher B : 40
# Group 2 with Teacher D : 100
# Group 3 with Teacher A : 80

Explanations

Here is a little example of how itertools.permutations works. The number 2 is the length of each permutation, and ['a','b','c'] are the elements of the permutation :

import itertools
permuts = itertools.permutations(['a','b','c'],2)
for i in a:
    print(i)

Output : (6 permutations here)

('a', 'b')
('a', 'c')
('b', 'a')
('b', 'c')
('c', 'a')
('c', 'b')

In our case, we have 3 groups, so we need 3 Teachers out of the 4 available (Teachers A,B,C and D). For example the permutation ('Teacher A', 'Teacher B', 'Teacher C') means Group1=Teacher A, Group2=Teacher B, Group3=Teacher C).

So we will enumerate all the ordered permutation of 3 teachers with permuts = itertools.permutations(cols, len(rows)):

('Teacher A', 'Teacher B', 'Teacher C')
('Teacher A', 'Teacher B', 'Teacher D')
('Teacher A', 'Teacher C', 'Teacher B')
...
('Teacher D', 'Teacher C', 'Teacher A')
('Teacher D', 'Teacher C', 'Teacher B')

So get 24 tuples in our variable permuts

Then we calculate the sum of the values for each permutation, and we get a big list containing these elements :

L = []
for p in permuts:
    s = 0
    d = {}
    for i, r in enumerate(rows):
        s += df[p[i]][r]
        d[r] = p[i]
    obj = [s, d]
    L.append(obj)

Output L:

[
    [100, {'Group 1': 'Teacher A', 'Group 2': 'Teacher B', 'Group 3': 'Teacher C'}]
    [80, {'Group 1': 'Teacher A', 'Group 2': 'Teacher B', 'Group 3': 'Teacher D'}]
...
    [220, {'Group 1': 'Teacher B', 'Group 2': 'Teacher D', 'Group 3': 'Teacher A'}]
]
...

the first number (100, 80 and 220 for example) represents the sum of values for this specific permutation.

Then we select the permutation with the maximum sum, here 220

result = max(L, key=lambda x: x[0])
# [220, {'Group 1': 'Teacher B', 'Group 2': 'Teacher D', 'Group 3': 'Teacher A'}]

And finally we print the permutation with the values from the dataframe with print("{} with {} : {}".format(i, v, df[v][i])). For example df["Teacher B"]["Group 1"] = 40 :

Group 1 with Teacher B : 40
Group 2 with Teacher D : 100
Group 3 with Teacher A : 80
like image 30
Phoenixo Avatar answered Oct 28 '22 19:10

Phoenixo