Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

algorithm to find relationship of two twitter users

I have a 6 degrees of Kevin Bacon type problem. Lets say I have 2 twitter users and I want to figure out their relationship to each other through friends (I use friends to denote when you follow someone vs them following you) and followers in twitter. I have all id's in my database.

So for example:

Joel and Sally

Joel follows Fred who is friends with Steve who follows Sally.

There could be multiple ways to get there, but I want the shortest.

This seems like a well known computer science problem (shortest path algorihm).

Today I have a table called "influencers" where all my twitter ids are stored, then I have a followings table that is a self referential table (ids of followers on one side and friends on the other.)

So is this graph theory? If so can someone point me to any utilities/libraries/approaches that can be helpful. Im using ruby, but can parse most languages.

like image 699
Joelio Avatar asked Jun 05 '12 11:06

Joelio


People also ask

Which algorithm is used in Twitter?

Tweets in reverse chronological order The Twitter algorithm identifies the important tweets and displays them first. After that, all others will appear in reverse chronological order to see your followers' most recent and significant updates.

How do you track Twitter interactions?

To access your Tweet activity: On a desktop or laptop computer, visit analytics.twitter.com and click on Tweets. In the Twitter app for iOS or Android, tap the analytics icon visible in your Tweets. Make sure you have installed the latest version of Twitter for iPhone, iPad, or Twitter for Android.


1 Answers

As you have said, it is a well known problem, as you can see in Wikipedia.

Just notice that in your case, the weights in all edges are equal to 1), so I don't think that Djikstra's algorithm would be very useful to you.

In order to find the minimum distance, I would suggest a breadth-first search. The problem is that the Twitter network may be extremelly connected and hence you may have a combinatorial explosion (imagine that each person is connected to 20 other persons - in the first level, you would visit 20 profiles, while in the next you would visit 400, and in the next 8000 - if you don't find Sally fast, you quickly will run out of memory).

There is also a linear programming formulation, with which I am not 100% familiar. These notes are good on linear programming, but not on the shortest path problem, while these seem more focused on the applications.

There is a video lecture on this problem available on line that seem quite complete.

I hope these references help.

like image 195
rlinden Avatar answered Sep 29 '22 17:09

rlinden