
How do social networking websites compute friend updates?

Social networking websites probably maintain tables for users, friends and events...

How do they use these tables to compute friends' events in an efficient and scalable manner?

yigal asked Apr 17 '09 22:04



2 Answers

Many social networking sites, Twitter among them, don't use an RDBMS for this at all but a message queue application. A lot of them start out with an existing application like RabbitMQ. Some of them get big enough that they have to customize it heavily or build their own. Twitter is in the process of doing this for the second time.

A message queue application works by holding messages from one service for one or more other services. For instance, say service Frank is publishing messages to a queue foo, and Joe and Jill are subscribed to Frank's foo queue. The application keeps track of whether Joe and Jill have received each message, and once every subscriber to the queue has received a message, it discards it. Frank fires messages and forgets about them. Joe and Jill ask for messages from foo and get whatever messages they haven't gotten yet. Joe and Jill then do whatever they need to do with each message, perhaps keeping it around, perhaps not.
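To make the Frank/Joe/Jill example concrete, here is a minimal in-memory sketch of that bookkeeping. It is a toy, not how RabbitMQ or any real broker is implemented; the class name `FanoutQueue` and its methods are made up for illustration.

```python
class FanoutQueue:
    """Toy queue: keeps each message until every subscriber has fetched it."""

    def __init__(self):
        self._messages = {}        # message id -> message body
        self._pending = {}         # message id -> subscribers still owed the message
        self._subscribers = set()
        self._next_id = 0

    def subscribe(self, name):
        self._subscribers.add(name)

    def publish(self, body):
        # Fire and forget: the publisher never waits on subscribers.
        if not self._subscribers:
            return  # nobody to deliver to, so nothing is retained
        msg_id = self._next_id
        self._next_id += 1
        self._messages[msg_id] = body
        self._pending[msg_id] = set(self._subscribers)

    def fetch(self, name):
        """Return, oldest first, every message `name` has not received yet."""
        delivered = []
        for msg_id in sorted(self._pending):  # sorted() copies the keys, so deleting is safe
            waiting = self._pending[msg_id]
            if name in waiting:
                delivered.append(self._messages[msg_id])
                waiting.discard(name)
                if not waiting:
                    # Every subscriber has received it: discard the message.
                    del self._pending[msg_id]
                    del self._messages[msg_id]
        return delivered

    def stored_count(self):
        return len(self._messages)


foo = FanoutQueue()
foo.subscribe("Joe")
foo.subscribe("Jill")
foo.publish("Frank posted a photo")   # Frank publishes and moves on
joe_messages = foo.fetch("Joe")       # Joe gets the message; Jill has not yet
jill_messages = foo.fetch("Jill")     # now everyone has it, so it is discarded
```

After Jill's fetch the queue holds nothing, which is the "discard once every subscriber has received it" behaviour described above.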

The message queue application guarantees that everyone who is supposed to get a message can and will get it when they request it. The publisher can send messages confident that subscribers will get them eventually. This has the benefit of being completely asynchronous and not requiring costly joins.

EDIT: I should also mention that the storage for this kind of thing at high scale is usually heavily denormalized. So Joe and Jill may each be storing a copy of the exact same message. This is considered OK because it helps the application scale to billions of users.
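As a rough sketch of what that denormalization can look like, assume each user has their own inbox and every update is copied into the inbox of each follower at write time. The names `followers`, `inboxes`, `follow`, `post_update` and `read_feed` are hypothetical, and plain dictionaries stand in for whatever storage a real site would use.

```python
from collections import defaultdict

followers = defaultdict(set)   # author -> users who receive the author's updates
inboxes = defaultdict(list)    # user -> that user's own copies of updates


def follow(user, author):
    followers[author].add(user)


def post_update(author, text):
    # Write path: copy the update into every follower's inbox (denormalized).
    for user in followers[author]:
        inboxes[user].append((author, text))


def read_feed(user):
    # Read path: one lookup by user, no join against a friends table.
    return list(inboxes.get(user, []))


follow("Joe", "Frank")
follow("Jill", "Frank")
post_update("Frank", "hello")
# Joe and Jill now each hold their own copy of the same message.
```

The trade-off is more storage and more work per write in exchange for cheap reads.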

Other reading:

  1. http://www.rabbitmq.com/
  2. http://qpid.apache.org/
2 revs, 2 users 88% answered Oct 23 '22 05:10


The mainstay data structure of social networking sites is the graph. On Facebook the graph is undirected (when you're someone's friend, they're your friend). On Twitter the graph is directed (you follow someone, but they don't necessarily follow you).

The two popular ways to represent graphs are adjacency lists and adjacency matrices.

An adjacency list records, for each user, the users they are connected to; flattened into rows, it is simply a list of the edges of the graph. Consider users identified by an integer userid.

User1  User2
1      2
1      3
2      3

The undirected interpretation of these records is that user 1 is friends with users 2 and 3 and user 2 is also friends with user 3.
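A small sketch of how those three records turn into per-user neighbour sets, under either interpretation. The function name `build_adjacency` and its `directed` flag are just for illustration.

```python
from collections import defaultdict

edges = [(1, 2), (1, 3), (2, 3)]  # the three records listed above


def build_adjacency(edge_rows, directed=False):
    """Map each user id to the set of user ids it is connected to."""
    adjacency = defaultdict(set)
    for a, b in edge_rows:
        adjacency[a].add(b)
        if not directed:
            # Undirected (friendship): the edge counts in both directions.
            adjacency[b].add(a)
    return adjacency


friends = build_adjacency(edges)                  # Facebook-style reading
follows = build_adjacency(edges, directed=True)   # Twitter-style reading
```

Under the undirected reading user 3 is connected to users 1 and 2; under the directed reading user 3 follows nobody.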

Representing this in a database table is trivial. It is the many-to-many join table that we are all familiar with. SQL queries to find the friends of a particular user are quite easy to write.

Now that you know a particular user's friends, you just need to join those results to the updates table. This table contains every user's updates, indexed by user id.

As long as all these tables are properly indexed, you'd have a pretty easy time designing efficient queries to answer the questions you're interested in.
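Here is one way the tables, indexes and join could look, using SQLite through Python's `sqlite3` module so it runs as-is. The table and column names (`friends`, `updates`, `user1`, `user2`, `user_id`, `body`) and the sample rows are assumptions for the sketch, not a schema any particular site uses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE friends (user1 INTEGER NOT NULL, user2 INTEGER NOT NULL);
    CREATE TABLE updates (
        id      INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL,
        body    TEXT NOT NULL
    );
    -- Index both sides of the edge, since a friendship can be stored either way round.
    CREATE INDEX idx_friends_user1 ON friends (user1);
    CREATE INDEX idx_friends_user2 ON friends (user2);
    CREATE INDEX idx_updates_user ON updates (user_id);

    INSERT INTO friends VALUES (1, 2), (1, 3), (2, 3);
    INSERT INTO updates (user_id, body) VALUES
        (1, 'update from user 1'),
        (2, 'update from user 2'),
        (3, 'update from user 3');
""")


def friend_updates(conn, user_id):
    """Return (user_id, body) for updates by the given user's friends, newest first."""
    query = """
        SELECT u.user_id, u.body
        FROM updates AS u
        JOIN (
            SELECT user2 AS friend_id FROM friends WHERE user1 = ?
            UNION
            SELECT user1 AS friend_id FROM friends WHERE user2 = ?
        ) AS f ON f.friend_id = u.user_id
        ORDER BY u.id DESC
    """
    return conn.execute(query, (user_id, user_id)).fetchall()


updates_for_user_2 = friend_updates(conn, 2)
```

The `UNION` handles the undirected case, where user 2 may appear in either column; for a directed (follower) graph a single `WHERE user1 = ?` lookup would do.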

Jason Punyon answered Oct 23 '22 03:10