
Optimizing a mysql query to fetch "unseen" entries per user

The title is rather confusing, but I couldn't come up with anything clearer.

Long story short, we're creating a mobile app connected to a Node.js server that talks to a MySQL database. Pretty common setup. We have multiple connected users who can upload "moments" to our servers. Each moment can be seen only once by each of the other users.

As soon as user x sees one of user y's moments, x can never see that moment again. Maybe a bit like Snapchat, except that a moment goes from a single user to multiple users instead of single to single. Moments are also ordered by distance from the current user's location.

Now, I'm looking for an intelligent way of fetching only the "unseen" moments from the database. For now, we're using a relational table between Users and Moments.

Let's say a user (ID = 20) sees a moment (ID = 30320), then we insert into this table 20 and 30320. I know. This is hardly scalable and probably a terrible idea.

I thought about maybe checking the last seen date and only fetching moments that are past this date, but again, moments are ordered by distance before being ordered by date so it is possible to see a moment that is 3 minutes old followed by a moment that is 30 seconds old.

Is there a more clever way of doing this, or am I doomed to use a relationship table between Moments and Users, and join to it when querying?

Thanks a lot.

EDIT -

This logic uses in total 3 tables.

  • Users
  • Moments
  • MomentSeen

MomentSeen only records which user has seen which moment, and when. Since the moments aren't ordered by date, I can't simply fetch all the moments that were uploaded after the last seen moment.

EDIT -

I just realized the mobile app Tinder must use similar logic to track which user "liked" which other user. Since you can't go back in time and see a user twice, they probably use a query very similar to the one I'm looking for.

Considering they have a lot of users, and that they're ordered by distance and some other unknown criteria, there must be a more clever way of doing things than a "UserSawUser" relational table.

EDIT -

I can't provide the entire database structure so I'll just leave the important tables and some of their fields.

Users { 
    UserID INT UNSIGNED AUTO_INCREMENT PRIMARY KEY
}

Moments {
    MomentID INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    UploaderID INT UNSIGNED, /* FK to UserID */
    TimeUploaded DATE /* usually NOW() at insertion */
} 

MomentSeen {
    /* Both are FK to Users and Moments */
    MomentID INT UNSIGNED,
    UserID INT UNSIGNED
}
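
To make the current approach concrete, the join against MomentSeen could look roughly like the sketch below. This is only an illustration: `buildUnseenMomentsQuery` is a hypothetical helper, the `?` placeholders follow the style of common Node.js MySQL drivers, excluding the user's own moments is an assumption, and the distance ordering is left out because the location columns aren't part of the schema shown here.

```javascript
// Sketch only: table and column names follow the schema above.
// buildUnseenMomentsQuery is a hypothetical helper, not part of any library.
function buildUnseenMomentsQuery(userId) {
  // Anti-join: keep only moments with no matching MomentSeen row for this user.
  const sql = `
    SELECT m.MomentID, m.UploaderID, m.TimeUploaded
    FROM Moments AS m
    WHERE m.UploaderID <> ?          -- assumption: skip the user's own moments
      AND NOT EXISTS (
        SELECT 1
        FROM MomentSeen AS s
        WHERE s.MomentID = m.MomentID
          AND s.UserID = ?
      )`;
  // One parameter per placeholder, in order of appearance.
  return { sql, params: [userId, userId] };
}

// Example: query text and parameters for user 20.
const unseenQuery = buildUnseenMomentsQuery(20);
```

A composite index on MomentSeen covering (UserID, MomentID) should let the subquery be answered by an index lookup instead of a scan, which is the main thing keeping this pattern usable as the table grows.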
Érik Desjardins asked Jul 28 '15 20:07


1 Answer

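You could consider implementing a Bloom filter. Bloom filters are widely used to avoid unnecessary disk seeks and improve performance: a "not present" answer is always correct, so the expensive lookup is needed only when the filter answers "possibly present".

Medium uses one to check whether a user has already read a post.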

More details here:
https://medium.com/the-story/what-are-bloom-filters-1ec2a50c68ff
https://en.wikipedia.org/wiki/Bloom_filter
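
As a rough illustration of the idea, here is a minimal Bloom filter sketch in JavaScript. Everything in it is an assumption made for the example: the class name, the default sizes (1024 bits, 4 hashes), and the FNV-1a based double hashing. A production setup would size the filter from the expected number of seen moments and the acceptable false-positive rate.

```javascript
// Minimal Bloom filter sketch (illustrative; sizes and hash scheme are assumptions).
class BloomFilter {
  constructor(bitCount = 1024, hashCount = 4) {
    this.bitCount = bitCount;
    this.hashCount = hashCount;
    this.bits = new Uint8Array(Math.ceil(bitCount / 8));
  }

  // 32-bit FNV-1a hash of a string, with a seed mixed into the offset basis.
  static fnv1a(str, seed) {
    let h = (0x811c9dc5 ^ seed) >>> 0;
    for (let i = 0; i < str.length; i++) {
      h ^= str.charCodeAt(i);
      h = Math.imul(h, 0x01000193) >>> 0;
    }
    return h;
  }

  // Derive hashCount bit positions by double hashing: h1 + i * h2.
  positions(key) {
    const s = String(key);
    const h1 = BloomFilter.fnv1a(s, 0);
    const h2 = (BloomFilter.fnv1a(s, 0x9e3779b9) | 1) >>> 0; // force an odd step
    const out = [];
    for (let i = 0; i < this.hashCount; i++) {
      out.push((h1 + i * h2) % this.bitCount);
    }
    return out;
  }

  // Record a key by setting all of its bits.
  add(key) {
    for (const p of this.positions(key)) {
      this.bits[p >> 3] |= 1 << (p & 7);
    }
  }

  // false: definitely never added. true: possibly added (false positives can occur).
  mightContain(key) {
    return this.positions(key).every(
      (p) => (this.bits[p >> 3] & (1 << (p & 7))) !== 0
    );
  }
}

// Example: one filter per user, keyed by moment ID.
const seenByUser20 = new BloomFilter();
seenByUser20.add(30320);
const maybeSeen = seenByUser20.mightContain(30320);
```

With one filter per user, a `false` from `mightContain` means the moment is definitely unseen and can be shown without touching MomentSeen. A `true` only means "possibly seen", so you either confirm it against the table or accept that a small fraction of unseen moments will be skipped.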

Brainhash answered Nov 08 '22 13:11