I have a complex query:
SELECT DISTINCT ON (delivery.id)
delivery.id, dl_processing.pid
FROM mailer.mailer_message_recipient_rel AS delivery
JOIN mailer.mailer_message AS message ON delivery.message_id = message.id
JOIN mailer.mailer_message_recipient_rel_log AS dl_processing ON dl_processing.rel_id = delivery.id AND dl_processing.status = 1000
-- LEFT JOIN mailer.mailer_recipient AS r ON delivery.email = r.email
JOIN mailer.mailer_mailing AS mailing ON message.mailing_id = mailing.id
WHERE
NOT EXISTS (SELECT dl_finished.id FROM mailer.mailer_message_recipient_rel_log AS dl_finished WHERE dl_finished.rel_id = delivery.id AND dl_finished.status <> 1000) AND
dl_processing.date <= NOW() - (36000 * INTERVAL '1 second') AND
NOT EXISTS (SELECT ml.id FROM mailer.mailer_message_log AS ml WHERE ml.message_id = message.id) AND
-- (r.times_bounced < 5 OR r.times_bounced IS NULL) AND
NOT EXISTS (SELECT ur.id FROM mailer.mailer_unsubscribed_recipient AS ur WHERE ur.email = delivery.email AND ur.list_id = mailing.list_id)
ORDER BY delivery.id, dl_processing.id DESC
LIMIT 1000;
It is running very slowly, and the reason seems to be that Postgres is consistently avoiding using merge joins in its query plan despite me having all the indices that I would need for this. It looks really depressing:
http://explain.depesz.com/s/tVY
http://i.stack.imgur.com/Myw4R.png
Why would this happen? How do I troubleshoot such an issue?
UPD: with @wildplasser's help I have reworked the query to fix performance (while changing its semantics somewhat):
SELECT delivery.id, dl_processing.pid
FROM mailer.mailer_message_recipient_rel AS delivery
JOIN mailer.mailer_message AS message ON delivery.message_id = message.id
JOIN mailer.mailer_message_recipient_rel_log AS dl_processing ON dl_processing.rel_id = delivery.id AND dl_processing.status in (1000, 2, 5) AND dl_processing.date <= NOW() - (36000 * INTERVAL '1 second')
LEFT JOIN mailer.mailer_recipient AS r ON delivery.email = r.email
WHERE
(r.times_bounced < 5 OR r.times_bounced IS NULL) AND
NOT EXISTS (SELECT dl_other.id FROM mailer.mailer_message_recipient_rel_log AS dl_other WHERE dl_other.rel_id = delivery.id AND dl_other.id > dl_processing.id) AND
NOT EXISTS (SELECT ml.id FROM mailer.mailer_message_log AS ml WHERE ml.message_id = message.id) AND
NOT EXISTS (SELECT ur.id FROM mailer.mailer_unsubscribed_recipient AS ur JOIN mailer.mailer_mailing AS mailing ON message.mailing_id = mailing.id WHERE ur.email = delivery.email AND ur.list_id = mailing.list_id)
ORDER BY delivery.id
LIMIT 1000
It now runs well, but the query plan still sports these horrible nested loop joins <_<:
http://explain.depesz.com/s/MTo3
I would still like to know why that is.
The reason is that Postgres is actually doing the right thing, and I suck at math. Suppose table A has N rows, and table B has M rows, and they are being joined via a column that they both have a B-tree index for. Then the following is true:
ORDER
clause, as we'll see it's not a bad deal at all.So basically despite being associated with the merge sort, which we all love, merge join almost always sucks.
The reason why my first query was so slow was because it had to perform sort before applying limit, and it was also bad in many other ways. After applying @wildplasser's suggestions, I managed to reduce the number of (still expensive) nested loops and also allow for limit to be taken without a sort, thus ensuring that Postgres most likely won't need to run the outer scan to its completition, which is where I derive the bulk of performance gains from.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With