There are microblogging posts, and votes/emoticons associated with them, both in MySQL innoDB tables. There is a requirement for two types of pages:
(A) Listing page containing many microblogs along with their votes counts/emoticons counts on single page ( say 25 ).
E.g.
THE GREAT FUNNY POST
Not so funny content in a meant to be funny post. Lorem ipsum dolor sit amet, consectetur adipiscing elit. Phasellus euismod consequat pellentesque. .....READ MORE....
(3) likes, (5) bored, (7) smiled
. + 24 More posts on same page.
(B) Permalink page containing a single microblog with detailed vote+vote counts/ emoticons.
THE GREAT FUNNY POST
Not so funny content in a meant to be funny post. Lorem ipsum dolor sit amet, consectetur adipiscing elit. Phasellus euismod consequat pellentesque. Quisque viverra adipiscing auctor. Mauris ut diam risus, in fermentum elit. Aliquam urna lectus, egestas sit amet cursus et, auctor ut elit. Nulla tempus suscipit nisi, nec condimentum dui fermentum non. In eget lacus mi, ut placerat nisi.
(You, Derp and 1 more like this), (5) bored, (7) smiled
1st approach:
Table#1:
post_id | post_content | post_title | creation_time
Table#2 for storing votes, likes, emoticons:
action_id | post_id | action_type | action_creator | creation_time
To display a page of posts, or a single post. First table is queried to get the posts,
second is queried to get all the actions related to the posts. Whenever a vote etc is done, an insert is made into the post_actions
table.
2nd approach:
Table#1:
post_id | post_content | post_title | creation_time | action_data
Where action_data
can be something like { "likes" : 3,"smiles":4 ...}
Table#2:
action_id | post_id | action_type | action_creator | creation_time
To display a page of posts, only first table is queried to get the posts & action data,
to display individual post with detailed actions, second table is queried to get all the actions related to the posts. Whenever a vote etc is done, an insert is made into the post_actions
table, and action_data
field of table#1 is updated to store updated count.
Assuming there are 100K posts, and 10x actions I.e. 1 million or more actions created. Does approach#2 provide a benefit? Any downsides of it apart from having to read, modify and update JSON information? Is there anyway in which approach#2 can be followed and further improved?
Adding more information based on feedback:
I would recommend either storing all likes/votes data (aggregated and atomic) inside of the table 1 and discard table 2 completely OR to use 2 tables without aggregated data while relying on a JOIN
syntax, clever queries and good indexes.
Why? Because else you will be querying and writing into both tables all the time when a comment/vote/like is made. Assuming 10 actions per post that are merely for displaying interaction, I'd really store it all into 1 table, maybe making an extra column for each kind of action. You can use JSON or simply serialize()
on the arrays, which should be a bit faster.
Which solution you pick in the end will be highly dependant on how many actions you get and how you want to use them. Getting all actions for 1 post is easy with solution 1 and very fast but searching inside would be a mess. On the other hand, solution 2 takes more space, careful query-writing and indexes.
Assuming there are much more reads from the system than writes I can think few ways to do this. You can take advantage of the fact that social networking sites really don't need to have consistent data, only eventually consistent as long as every user sees his/her actions consistently.
Option #1.
Add column for each action type in Table#1 and increment them every time new action happens. In this way the main page listing is very fast.
Table#1
post_id | post_content | post_title | creation_time | action1_count | action2_count | action3_count | ...
What is cool in this approach is that when viewing permalink you don't need to query all actions for post from table#2. Just query last 5 any actions and all actions made by the viewer. Check inspiration here: How to get the latest 2 items per category in one select (with mysql)
Option #2.
This is like your first approach, but write action counts in the Redis hashset or simply as JSON object to memcache. It's lighting fast to query those on main page load. Drawback is that if redis (and always when memcached) is restarted you need to re-initialize them, or just do it when somebody views page from permalink view.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With