Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Design for database

I have 4 tables: users, posts, categories, categories_map

posts has id, text, category_id
categories_map contains user_id and category_id

My goal is to make a queue that the user can preview. Also, the user will be able to skip some posts or edit text in them. If the user skipped a post it will never appear in queue. However, the user is not able to change sequence because cron will be executing a script.

The first approach I think is to create a table that will contain user_id, post_id, text_modified, is_skipped, last_posted. So when the cron job is executed it will leave a timestamp so next time this post won't be grabbed and the user easily can change the text for this post.

The second approach is to create a separate table where a queue will be generated for user user_id, post_id, category_id, text_modified. So the cron job can easily job follow this table and remove the row after it was done. But with this approach if I will have 30 users, with an average of 3 categories that contains 5000 posts each, my table will have 450000 rows already. Yes if it is indexed properly it should be all good. But will it be scalable when I have 100-200 users?

Which approach should I go or is there any other solution?

like image 716
user256968 Avatar asked Mar 19 '17 18:03

user256968


1 Answers

A lot of things depend on your product. We don't know:

  1. How users interact with each other?
  2. Do their actions (skips) need to be persisted, or are we ok, if they lose them above 99.9 percentile.
  3. Are their text modification on the posts, globally visible, or only to them.
  4. Are the users checking posts by category?

Said all these unknowns, I'll take a stab at it:

  • If the answer to question 4 is YES then option #2 seems more sound judging from your PKs.
  • If the answer to question 4 is NO then option #1 seems more sound judging from your PKs.

For database size, I think you're doing a bit of pre-optimization. You should take into account table width. Since your tables are very narrow (only a couple of columns and mainly ints), you shouldn't worry too much about the length of the specific table.

When that becomes a constraint, (which you can benchmark, or wait to see disk space on the specific servers), you can scale up the databases by sharding on the user easily. You basically put different users on different db servers.

  • Note: Question 1 will determine how easy the above would be.

Said all this, keep in mind performance implications:

  1. The lists are going to get really long.
  2. If the users modification affect other users, you are going to have to do quite a bit of fan-out work, to publish the updates to the specific queues.

In that case, you might want to take a look at some distributed cache like Memcached, Redis.

  • Note: Depending on answers to Questions 2 & 3, you might not even need to persist the queues.
like image 195
Ajk_P Avatar answered Sep 30 '22 14:09

Ajk_P