
How to calculate correlation amongst preferences?

I have to split a group of x people into 3 or 4 groups, most likely 3.

I want people to be happy, so I'm having each person rate the other members of the big group from 1 to (x-1).

How do I optimize preferences to create 3 groups?

wehavinthisbaby asked Apr 14 '11 19:04



2 Answers

Here is a method that is likely to get a good arrangement, even if it is not an optimal arrangement:

First create a ranking function that can take any pair of groupings and determine whether one is better than the other. Then apply the following algorithm:

  1. Randomly assign people into groups.
  2. Randomly pick one person from each group.
  3. Generate every grouping obtained by permuting the people chosen in step 2 among the groups. (For 3 groups there are 3! = 6 such reassignments, including the one that leaves everyone in place. For 4 groups there are 4! = 24.)
  4. Of all possible reassignments, pick the best one.
  5. Repeat steps 2–4 one million times.
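The steps above can be sketched in Python roughly as follows. This is only an illustration under some assumptions of mine: `ratings[a][b]` is the number person `a` gave person `b`, a higher number means a stronger preference, and a grouping is ranked by the hypothetical `total_score` helper (any other ranking function could be swapped in). The names `total_score` and `improve_by_rotation` are made up for this example.

```python
import itertools
import random

def total_score(groups, ratings):
    """Placeholder ranking: sum of a's rating of b over all group-mates a != b."""
    return sum(ratings[a][b] for g in groups for a in g for b in g if a != b)

def improve_by_rotation(ratings, n_groups=3, iterations=10_000, seed=0):
    rng = random.Random(seed)
    people = list(range(len(ratings)))
    rng.shuffle(people)
    # Step 1: random initial assignment, group sizes as equal as possible.
    groups = [people[i::n_groups] for i in range(n_groups)]
    for _ in range(iterations):
        # Step 2: pick one seat (and so one person) in each group.
        seats = [rng.randrange(len(g)) for g in groups]
        chosen = [g[i] for g, i in zip(groups, seats)]
        # Steps 3-4: try every way of handing the chosen people back
        # to the open seats and remember the best one.
        best_perm, best = None, None
        for perm in itertools.permutations(chosen):
            for g, i, p in zip(groups, seats, perm):
                g[i] = p
            score = total_score(groups, ratings)
            if best is None or score > best:
                best, best_perm = score, perm
        for g, i, p in zip(groups, seats, best_perm):
            g[i] = p
    return groups, total_score(groups, ratings)
```

Because the "leave everyone in place" permutation is always among the candidates, the score can never get worse from one round to the next. Far fewer than a million rounds are usually enough for a small group, so the iteration count here is just a default.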

UPDATE

If there are only 18 people to assign, then splitting them into three groups of six gives just (18 choose 6) * (12 choose 6) / 6 = 2,858,856 possible groupings. (With four groups of sizes 4, 4, 5, and 5 it's (18 choose 4) * (14 choose 4) * (10 choose 5) / 4 = 192,972,780 groupings.)

You can just try each one and pick the best.
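As a quick check of those counts, and a sketch of how the enumeration might look, here is some Python. The generator `equal_partitions` is a name I made up. It assumes all groups have the same size, and it avoids producing the same grouping twice by always putting the first remaining person into the next group.

```python
from itertools import combinations
from math import comb

# Counts quoted above: divide by 3! because the three groups of six are
# interchangeable, and by 2! * 2! for the two 4-groups and two 5-groups.
three_groups = comb(18, 6) * comb(12, 6) // 6
four_groups = comb(18, 4) * comb(14, 4) * comb(10, 5) // 4

def equal_partitions(people, size):
    """Yield every split of `people` into unordered groups of `size`."""
    people = list(people)
    if not people:
        yield []
        return
    first, rest = people[0], people[1:]
    # The first remaining person always anchors the next group, so no
    # grouping is generated more than once.
    for mates in combinations(rest, size - 1):
        remaining = [p for p in rest if p not in mates]
        for tail in equal_partitions(remaining, size):
            yield [(first, *mates), *tail]
```

Picking the best grouping is then something like `max(equal_partitions(range(18), 6), key=score)`, where `score` is whatever ranking function you settle on.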

I guess the ranking algorithm itself is really the hard part of this assignment.

You could just give each person a score based on summing the scores of the people selected to be in their group, then sum the scores of each person together.
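A minimal sketch of that scoring idea, assuming `ratings[a][b]` holds the number person `a` gave person `b` and that a higher number means `a` would rather be with `b` (flip the sign if 1 means "first choice"). The function names and the sample names are hypothetical.

```python
def person_score(person, group, ratings):
    """How much `person` likes the rest of their group."""
    return sum(ratings[person][other] for other in group if other != person)

def grouping_score(groups, ratings):
    """Sum of every person's score across all groups."""
    return sum(person_score(p, g, ratings) for g in groups for p in g)

# Made-up ratings for four people, 3 = most preferred.
ratings = {
    "ann": {"bob": 3, "cat": 1, "dan": 2},
    "bob": {"ann": 3, "cat": 2, "dan": 1},
    "cat": {"ann": 1, "bob": 2, "dan": 3},
    "dan": {"ann": 1, "bob": 2, "cat": 3},
}
happy = grouping_score([["ann", "bob"], ["cat", "dan"]], ratings)
unhappy = grouping_score([["ann", "cat"], ["bob", "dan"]], ratings)
```

With these numbers the first grouping scores higher than the second, because it pairs people who rated each other highly.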

The problem is that you're going to end up with all the popular people in one group, and all the unpopular people in another group, and all the telephone handset cleaners in another group.

You should just assign people randomly, and then tell them that you used some really scientific system. That way everybody gets a good mix.

Jeffrey L Whitledge answered Sep 21 '22 17:09


Measure the total satisfaction of a given configuration by how closely each person's assigned group-mates match their stated preferences. Start with a randomized set of groups, then use something like hill climbing or simulated annealing to optimise.

http://en.wikipedia.org/wiki/Hill_climbing

http://en.wikipedia.org/wiki/Simulated_annealing

Simulated annealing sounds complicated, but it's really just a cleverer version of hill-climbing.
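To show how small the difference is, here is a rough simulated annealing sketch in Python. Everything specific in it is my assumption rather than part of the question: `ratings[a][b]` is person `a`'s rating of `b` with higher meaning better, satisfaction is the plain sum of ratings between group-mates, a move swaps two people between two groups, and the temperature cools geometrically from `t_start` to `t_end`.

```python
import math
import random

def satisfaction(groups, ratings):
    """Assumed measure: sum of a's rating of b over all group-mates a != b."""
    return sum(ratings[a][b] for g in groups for a in g for b in g if a != b)

def anneal(ratings, n_groups=3, steps=20_000, t_start=5.0, t_end=0.01, seed=0):
    rng = random.Random(seed)
    people = list(range(len(ratings)))
    rng.shuffle(people)
    groups = [people[i::n_groups] for i in range(n_groups)]
    current = satisfaction(groups, ratings)
    best, best_groups = current, [g[:] for g in groups]
    for step in range(steps):
        # Geometric cooling schedule from t_start down towards t_end.
        temp = t_start * (t_end / t_start) ** (step / steps)
        # Candidate move: swap one person between two different groups.
        g1, g2 = rng.sample(range(n_groups), 2)
        i = rng.randrange(len(groups[g1]))
        j = rng.randrange(len(groups[g2]))
        groups[g1][i], groups[g2][j] = groups[g2][j], groups[g1][i]
        delta = satisfaction(groups, ratings) - current
        # Always accept improvements; accept a worse grouping with a
        # probability that shrinks as the temperature drops.
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current += delta
            if current > best:
                best, best_groups = current, [g[:] for g in groups]
        else:
            # Rejected: undo the swap.
            groups[g1][i], groups[g2][j] = groups[g2][j], groups[g1][i]
    return best_groups, best
```

Plain hill climbing is the same loop with the `rng.random() < math.exp(...)` branch removed, so a worse grouping is never accepted. The occasional backward step is what lets annealing escape a local optimum.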

Ben answered Sep 21 '22 17:09