Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Algorithm to group objects

I have the following classes:

class Sport {
    private String sportsName;
    private List<People> peopleWhoPlayThisSport;
    //...
}
class People {
    private String name;
    private long uniqueId;
   // ...
}

My input is a list of sport objects, for simplicity consider the below examples:

sport1 - Football,   <Sam, Dylan>
sport2 - Basketball, <Tyler, John>
sport3 - Baseball,   <Carter, Dylan>
sport4 - Hockey,     <Kane, Michael>
sport5 - Soccer,     <Carter, Frank>

I have to create a List<List<>>, such that the inner list is all the sports which have at least 1 common player (Transitive property applies here). In the above example, the output should be,

<<sport1,sport3,sport5> , <sport2> , <sport4>>

Any suggestions on solving this and/or pseudo-code?

like image 509
user3704576 Avatar asked Aug 31 '15 10:08

user3704576


1 Answers

Sounds like graph problem to me. What I would do is following:

  1. Create a graph (undirected) where people are nodes, so far without edges
  2. I would go through sports and for every sport I would do an edge between people if they play the same sport (for example when processing sport 1 it would create edge between Sam and Dylan, when processing sport 3, it would do edge between Dylan and Carter)
  3. As last step I would take components of the final graph (in your example Sam-Dylan-Carter-Frank, Kane-Michael, Tyler-John) and "apply the sports to them" - that means that for every boy/girl in the component, I would add all the sports he/she does into the "inner" list (I would prefer Set so every sport is there once).

So the graph would grow this way:

  1. Processing Football: Sam-Dylan
  2. Processing Basketball: Sam-Dylan, Tyler-John
  3. Processing Baseball: Sam-Dylan-Carter, Tyler-John
  4. Processing Hockey: Sam-Dylan-Carter, Tyler-John, Kane-Michael
  5. Processing Soccer: Sam-Dylan-Carter-Frank, Tyler-John, Kane-Michael

And "applying sports" :

  1. Sam (Football), Dylan (Football, Baseball), Carter (Baseball, Soccer), Frank (Soccer) => (Football, Baseball, Soccer)
  2. Tyler (Basketball), John (Basketball) => (Basketball)
  3. Kane (Hockey), Michael (Hockey) => (Hockey)

==> (Football, Baseball, Soccer), (Basketball), (Hockey)

Edit: Optionally you may optimize the algorithm that for every component, you will remember what sports are tied to it. In other words, when creating an edge, you will add a sport to the component's collection of sports. Then "apply sports" step won't be no longer needed. One simple rule, when two components get connected, you will union sport collections before adding a new sport. The algorithm then would go:

  1. Processing Football: Sam-Dylan(Football)
  2. Processing Basketball: Sam-Dylan(Football), Tyler-John(Basketball)
  3. Processing Baseball: Sam-Dylan-Carter(Football, Baseball), Tyler-John(Basketball)
  4. Processing Hockey: Sam-Dylan-Carter(Football,Baseball), Tyler-John(Basketball), Kane-Michael(Hockey)
  5. Processing Soccer: Sam-Dylan-Carter-Frank(Football,Baseball,Soccer), Tyler-John(Basketball), Kane-Michael(Hockey)

Please note that usage of graph is not neccessary. You can still go away with simple collections but the graph seemed as the cleanest way and algorithm-optimal way to do so. It also allows further extensibility because it models the data in natural way - for example you can further find out why Sam is in group with Carter (because their common friend Dylan plays a different sport with both of them).

like image 102
zdenda.online Avatar answered Oct 07 '22 07:10

zdenda.online