How do I best gather messages that have not been read by a given user?
Existing tables
Message table
----------------------------------
id title body sentAt
User table
----------------------------------
id username
Read Messages table
----------------------------------
user_id message_id
I'm thinking something like
select
m.id, m.title, m.sentAt, u.username
from
message m,
[user] u
where
u.id = 1 and -- @userId parameter
m.id not in
(select r.message_id from read_messages r where r.user_id = u.id)
Unfortunately for me I dont understand much looking at the execution plan. /Adam
The way to make a query run faster is to reduce the number of calculations that the software (and therefore hardware) must perform. To do this, you'll need some understanding of how SQL actually makes calculations.
Suggesting an alternative approach:
I was faced with the exact same problem at work earlier. I wasted a good week in trying to figure the best way to do this. I ended up with creating a join table, as you have done, but the table contains only unread messages, instead of keeping track of read messages.
Now, if I would've kept track of all the messages everyone has read, the clutter in the database grows pretty rapidly (users*messages rows), easily leading to thousands of rows of 'dead weight' in even smaller applications. This problem is exaggerated if the lifetime of messages are indefinite - you could be keeping track of message statuses that are many years old.
If keeping track of the inverse, your "unread messages" table contains only a handful of rows, and they diminish for each message that a user reads. Also, getting the amount of unread messages is as simple as "SELECT COUNT(*) FROM unread WHERE user = foo
".
As everything, this is a trade-off. While reading is pretty much as fast as computationally possible, writing is a chore. For each written message, you need to insert an entry to this join table. Additionally, if multiple people can read the same message, you need to insert one row for each recipient. If the recipients are implicit (e.g. only a user group's name is given, or even with the criteria such as "anyone who has access to this thing"), creating new messages becomes even more complicated.
But I feel this is a fair compromise.
YMMV, HTH.
NOT IN is very expensive. Instead, you could do something like:
SELECT
m.id, m.title, m.sentAt
FROM
message m
LEFT JOIN [Read Messages] rm
ON m.message_id = rm.message_id AND rm.user_id = @userID
WHERE
rm.user_id IS NULL
Provided you have the right indexes, this should be much faster.
You're fetching all messages and LEFT JOINing read messages for that user. Then, in the WHERE clause you're asking for that user_id to be NULL for that message, meaning the user has not read it yet.
HP's alternative suggestion is most likely suiting you well. However, if it is not, Id suggest using NOT EXISTS rather than LEFT JOIN if it's possible in your environment.
At least on MS SQL it would give a slightly cheaper query plan, since it would not need the last filtering (user_id IS NULL)
SELECT ...
FROM message m
WHERE NOT EXISTS (
SELECT 1
FROM read_messages rm
WHERE rm.usr_id = ...
AND rm.msg_id = m.msg_id
)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With