Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Database structure and queries for a Twitter-like posts timeline

I'm posting this question with a Django tag, because I'm working in this environment, but it's actually more or a generic question: I'd like to show something like my timeline on Twitter, i.e. a list of posts of all the people I am following ordered chronologically.

My typical database structure looks like this:

Table Name: Users
Columns:
    UserID PK
    EmailAddress
    Password

TableName: Friends
Columns:
    UserID PK FK
    FriendID PK FK

TableName: Posts
Columns:
    UserID PK FK
    Content

If I wanted to retrieve all posts of all the friends for a given user, it looks like this (this is not supposed to be valid SQL!):

SELECT * FROM Posts WHERE UserID IN (LIST OF "Given user friends' IDs")

This works - no problem! However, this does so not scale! Let's assume we have a busy website and our given user has 2,000 friends and there are some ten million posts in the database. In such a scenario, the database query would be highly inefficient and slow.

Can this be solved with a relational database like PostgreSQL or MySQL? If not, how does e.g. Twitter do it?

like image 565
Simon Steinberger Avatar asked Jan 04 '23 19:01

Simon Steinberger


2 Answers

A shorter version:

SELECT
    Posts.*
FROM
    Posts 
    JOIN Friends ON Friends.FriendID = Posts.UserID
WHERE
    Friends.UserID = 1 /* change as needed */ ;

You can check it with:

CREATE TABLE Users
(
    UserID integer PRIMARY KEY,
    EmailAddress text,
    Password text
) ;

CREATE TABLE Friends
(
    UserID integer NOT NULL REFERENCES Users(UserID),
    FriendID integer NOT NULL REFERENCES Users(UserID),
    PRIMARY KEY (UserID, FriendID)
) ;

CREATE TABLE Posts
(
    PostID integer PRIMARY KEY,
    UserID integer NOT NULL REFERENCES Users(UserID),
    Content text
) ;

INSERT INTO Users 
VALUES 
  (1, '[email protected]', 'pass1'),
  (2, '[email protected]', 'pass2'),
  (3, '[email protected]', 'pass3'),
  (4, '[email protected]', 'pass4') ;

INSERT INTO Friends
VALUES 
   (1, 2),
   (1, 4) ;

INSERT INTO Posts
VALUES
    (1, 2, 'A post from User 2'),
    (2, 2, 'Another post from User 2'),
    (3, 3, 'A post from User 3'),
    (4, 4, 'A post from User 4') ;

and you should get:

+---+---+--------------------------+
| 1 | 2 | A post from User 2       |
| 2 | 2 | Another post from User 2 |
| 4 | 4 | A post from User 4       |
+---+---+--------------------------+
like image 194
joanolo Avatar answered Jan 08 '23 10:01

joanolo


   SELECT p.*
     FROM Posts AS p
     JOIN (SELECT FriendID
             FROM Friends
            WHERE UserID = :given_user_id) AS f
       ON (p.UserID = f.FriendID)

scales much better. As you have marked your PK’s, you have all needed indexes in place already.

like image 45
Dario Avatar answered Jan 08 '23 11:01

Dario