
What is the recommended way to build functionality similar to Stackoverflow's "Inbox"?

I have an ASP.NET MVC website where people manage a list of projects. Based on some algorithm, I can tell if a project is out of date. When a user logs in, I want it to show the number of stale projects (similar to the number of updates I see in the Stack Overflow inbox).

The algorithm to calculate stale projects is kind of slow, so every time a user logs in, I would have to:

  1. Run a query for all projects where they are the owner
  2. Run the IsStale() algorithm
  3. Display the count where IsStale = true

My guess is that this will be really slow. Also, on every project write, I would have to recalculate the above to see if the count changed.

Another idea I had was to create a table and run a job every minute to calculate stale projects and store the latest count in this metrics table, then just query that when users log in. The issue there is that I still have to keep that table in sync, and if it only recalculates once every minute, a project update won't change the value until up to a minute later.

Any ideas for a fast, scalable way to support this inbox concept to alert users of the number of items to review?

leora asked Mar 12 '12 04:03




2 Answers

The first step is always proper requirement analysis. Let's assume I'm a Project Manager. I log in to the system and it displays my only project as on time. A developer comes to my office and tells me there is a delay in his activity. I select the developer's activity and change its duration. The system still displays my project as on time, so I happily leave work.

How do you think I would feel if I received a phone call at 3:00 AM from the client asking me to explain why the project is no longer on time? Obviously, quite surprised, because the system didn't warn me in any way. Why did that happen? Because I had to wait 30 seconds (why not only 1 second?) for the next run of a scheduled job to update the project status.

That just can't be a solution. A warning must be sent immediately to the user, even if it takes 30 seconds to run the IsStale() process. Show the user a loading... image or anything else, but make sure the user has accurate data.

Now, regarding the implementation, there is no way around the previous issue: you will have to run that process whenever something that affects a due date changes. What you can do, however, is avoid running it unnecessarily. For example, you mentioned that you could run it whenever the user logs in. What if 2 or more users log in and see the same project and don't change anything? It would be unnecessary to run the process twice.

What's more, if you make sure the process is run when the user updates the project, you won't need to run the process at any other time. In conclusion, this scheme has the following advantages and disadvantages compared to the "polling" solution:

Advantages

  • No scheduled job
  • No unneeded process runs (this is arguable because you could set a dirty flag on the project and only run it if it is true)
  • No unneeded queries of the dirty value
  • The user will always be informed of the current and real state of the project (which is by far, the most important item to address in any solution provided)

Disadvantages

  • If a user updates a project and then updates it again in a matter of seconds, the process would be run twice (in the polling scheme the process might not even be run once in that period, depending on how frequently it has been scheduled)
  • The user who updates the project will have to wait for the process to finish
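A minimal sketch of the update-time approach described above, in Python for brevity rather than the asker's ASP.NET MVC stack. The `Project` class, the `is_stale` stand-in and the `update_due_date` helper are all hypothetical names used for illustration; the real IsStale() algorithm is not shown in the question.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Project:
    name: str
    due: date
    is_stale: bool = False  # cached result of the slow check


def is_stale(project: Project, today: date) -> bool:
    # Hypothetical stand-in for the slow IsStale() algorithm.
    return project.due < today


def update_due_date(project: Project, new_due: date, today: date) -> None:
    # Write path: change the data, then refresh the cached flag right away,
    # so the user who made the change waits for the slow check.
    project.due = new_due
    project.is_stale = is_stale(project, today)


def stale_count(projects: list[Project]) -> int:
    # Read path (for example at login): only the cached flags are read.
    return sum(p.is_stale for p in projects)
```

The slow check runs once per write, and logging in only sums flags that are already stored.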

As for how to implement a notification system similar to Stack Overflow's, that's quite a different question. I guess you have a many-to-many relationship between users and projects. The simplest solution would be adding a single attribute, Has_pending_notifications, to the relationship between those entities (the middle table):

Cardinalities: A user has many projects. A project has many users

That way, when you run the process, you should update each user's Has_pending_notifications with the new result. For example, if a user updates a project and it is no longer on time, then you should set the Has_pending_notifications field to true for all of the project's users so that they're aware of the situation. Similarly, set it to false when the project is on time (I understand you just want to make sure the notifications are displayed when the project is no longer on time).

Taking StackOverflow's example, when a user reads a notification you should set the flag to false. Make sure you don't use timestamps to guess if a user has read a notification: logging in doesn't mean reading notifications.
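The flag on the middle table could look roughly like the following, sketched with Python's built-in sqlite3 module instead of SQL Server. Only Has_pending_notifications comes from the answer; the table names, the other column names and the three helper functions are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE projects (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    -- Middle table of the many-to-many relationship, carrying the flag.
    CREATE TABLE user_projects (
        user_id INTEGER NOT NULL REFERENCES users(id),
        project_id INTEGER NOT NULL REFERENCES projects(id),
        has_pending_notifications INTEGER NOT NULL DEFAULT 0,
        PRIMARY KEY (user_id, project_id)
    );
""")


def set_pending(conn, project_id, pending):
    # Called after the slow check has run for one project:
    # every user of that project gets the same flag value.
    conn.execute(
        "UPDATE user_projects SET has_pending_notifications = ? "
        "WHERE project_id = ?",
        (int(pending), project_id),
    )


def mark_read(conn, user_id, project_id):
    # Called when one user actually reads the notification, not at login.
    conn.execute(
        "UPDATE user_projects SET has_pending_notifications = 0 "
        "WHERE user_id = ? AND project_id = ?",
        (user_id, project_id),
    )


def pending_count(conn, user_id):
    # Cheap query for the number shown when the user logs in.
    row = conn.execute(
        "SELECT COUNT(*) FROM user_projects "
        "WHERE user_id = ? AND has_pending_notifications = 1",
        (user_id,),
    ).fetchone()
    return row[0]
```

Reading a notification clears the flag for that user only, so other users of the same project still see it as pending.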

Finally, if the notification itself is complex enough, you can move it away from the relationship between users and projects and go for something like this:

Cardinalities: A user has many projects. A project has many users. A user has many notifications. A notification has one user. A project has many notifications. A notification has one project.
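One possible reading of these cardinalities, again sketched with Python's sqlite3 module. The table names, the `message` and `is_read` columns, and both helper functions are assumptions for illustration; the answer only specifies the relationships.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE projects (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE user_projects (
        user_id INTEGER NOT NULL REFERENCES users(id),
        project_id INTEGER NOT NULL REFERENCES projects(id),
        PRIMARY KEY (user_id, project_id)
    );
    -- Each notification belongs to exactly one user and one project.
    CREATE TABLE notifications (
        id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES users(id),
        project_id INTEGER NOT NULL REFERENCES projects(id),
        message TEXT NOT NULL,
        is_read INTEGER NOT NULL DEFAULT 0
    );
""")


def notify_project_users(conn, project_id, message):
    # Create one unread notification per user attached to the project.
    conn.execute(
        "INSERT INTO notifications (user_id, project_id, message) "
        "SELECT user_id, ?, ? FROM user_projects WHERE project_id = ?",
        (project_id, message, project_id),
    )


def unread_count(conn, user_id):
    # The inbox number for one user.
    row = conn.execute(
        "SELECT COUNT(*) FROM notifications "
        "WHERE user_id = ? AND is_read = 0",
        (user_id,),
    ).fetchone()
    return row[0]
```

Compared with a single flag on the middle table, this keeps a history and lets each notification carry its own text and read state.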

I hope something I've said has made sense, or given you some other, better idea :)

Mosty Mostacho answered Sep 20 '22 08:09



You can do as follows:

  1. To each user record, add a datetime field recording the last time the slow computation was done. Call it LastDate.
  2. To each project, add a boolean saying whether it has to be listed. Call it Selected.
  3. When you run the slow procedure, you update the Selected fields.
  4. Now, when the user logs in, if LastDate is close enough to now, you use the results of the last slow computation and just take all projects with Selected set to true. Otherwise you run the slow computation again.

The above procedure is optimal because it re-runs the slow procedure ONLY IF ACTUALLY NEEDED, while running a procedure at fixed intervals risks wasting time, because the user may never use the result of a computation.
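The steps above can be sketched in Python as follows. LastDate and Selected come from the answer; the `User` class, the `MAX_AGE` window of five minutes, the `days_since_update` field and the `slow_is_stale` stand-in are assumptions made for the example.

```python
from datetime import datetime, timedelta

MAX_AGE = timedelta(minutes=5)  # assumed meaning of "close enough to now"


class User:
    def __init__(self, projects):
        self.projects = projects  # list of dicts, one per project
        self.last_date = None     # LastDate: when the slow pass last ran


def slow_is_stale(project):
    # Hypothetical stand-in for the slow IsStale() algorithm.
    return project["days_since_update"] > 30


def stale_projects(user, now):
    # Re-run the slow pass only when the cached result is missing or too old.
    if user.last_date is None or now - user.last_date > MAX_AGE:
        for project in user.projects:
            project["selected"] = slow_is_stale(project)  # Selected flag
        user.last_date = now
    return [p for p in user.projects if p["selected"]]
```

Within the `MAX_AGE` window a login reuses the stored Selected flags, so a project changed during that window is reported with its old state until the next recomputation.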
Francesco Abbruzzese answered Sep 24 '22 08:09
