OK, here's a query I'm running right now on a table that has 45,000 records and is 65MB in size. The table is about to get a lot bigger, so I have to think about future performance as well:
SELECT count(payment_id) as signup_count, sum(amount) as signup_amount
FROM payments p
WHERE tm_completed BETWEEN '2009-05-01' AND '2009-05-30'
AND completed > 0
AND tm_completed IS NOT NULL
AND member_id NOT IN (SELECT p2.member_id FROM payments p2 WHERE p2.completed=1 AND p2.tm_completed < '2009-05-01' AND p2.tm_completed IS NOT NULL GROUP BY p2.member_id)
As you might imagine, it chokes the MySQL server to a standstill.
What it does is pull the number of new users who signed up: payments in the date range that are "completed", whose tm_completed is not empty (it is only populated for completed payments), and whose member has never had a "completed" payment before (that's the embedded SELECT). The system does rebills and whatnot, so this is the only way to differentiate between an existing member who just got rebilled and a new member who got billed for the first time.
Is there any way to optimize this query so that it uses fewer resources and stops bringing my MySQL server to its knees?
If I'm missing any info that would clarify this further, let me know.
EDIT:
Here are the indexes already on that table:
Keyname       Type     Cardinality  Field
PRIMARY       PRIMARY  46757        payment_id
member_id     INDEX    23378        member_id
payer_id      INDEX    11689        payer_id
coupon_id     INDEX    1            coupon_id
tm_added      INDEX    46757        tm_added, product_id
tm_completed  INDEX    46757        tm_completed, product_id
Those kinds of IN subqueries are a bit slow in MySQL. I would rephrase it like this:
SELECT COUNT(1) AS signup_count, SUM(amount) AS signup_amount
FROM payments p
WHERE tm_completed BETWEEN '2009-05-01' AND '2009-05-30'
AND completed > 0
AND NOT EXISTS (
SELECT member_id
FROM payments
WHERE member_id = p.member_id
AND completed = 1
AND tm_completed < '2009-05-01');
The check 'tm_completed IS NOT NULL' is not necessary, as that is implied by your BETWEEN condition.
Also make sure you have an index on:
(tm_completed, completed)
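As a rough sketch, an index like that could be added as shown below. The index names are made up for illustration. The second index is an assumption that is not part of the suggestion above: the correlated subquery filters on member_id, completed and tm_completed, so an index covering those three columns may help it too. Check with EXPLAIN before keeping either one.

```sql
-- Composite index on (tm_completed, completed) for the outer date-range filter.
-- The name idx_tm_completed_completed is hypothetical.
ALTER TABLE payments
  ADD INDEX idx_tm_completed_completed (tm_completed, completed);

-- Assumption, not from the answer above: an index matching the lookup done by
-- the NOT EXISTS subquery (equality on member_id and completed, range on
-- tm_completed). The name idx_member_completed_tm is hypothetical.
ALTER TABLE payments
  ADD INDEX idx_member_completed_tm (member_id, completed, tm_completed);
```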
I had fun putting together this solution which does not require a subquery:
SELECT count(p1.payment_id) as signup_count,
sum(p1.amount) as signup_amount
FROM payments p1
LEFT JOIN payments p2
ON p1.member_id = p2.member_id
AND p2.completed = 1
AND p2.tm_completed < date '2009-05-01'
WHERE p1.completed > 0
AND p1.tm_completed between date '2009-05-01' and date '2009-05-30'
AND p2.member_id IS NULL;
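To see why the LEFT JOIN plus IS NULL filter only counts first-time payers, here is a small sketch on a throwaway table. The table name payments_demo and the rows in it are invented purely for illustration; only the column names come from the question.

```sql
-- Hypothetical demo table; a regular table is used because MySQL cannot
-- reference a TEMPORARY table twice in the same query.
CREATE TABLE payments_demo (
  payment_id   INT PRIMARY KEY,
  member_id    INT NOT NULL,
  amount       DECIMAL(10,2) NOT NULL,
  completed    TINYINT NOT NULL,
  tm_completed DATETIME NULL
);

INSERT INTO payments_demo VALUES
  (1, 10, 20.00, 1, '2009-04-10 12:00:00'),  -- member 10: first payment in April
  (2, 10, 20.00, 1, '2009-05-10 12:00:00'),  -- member 10: rebill in May
  (3, 11, 30.00, 1, '2009-05-12 09:00:00'),  -- member 11: first payment in May
  (4, 12, 15.00, 0, NULL);                   -- member 12: never completed

-- Same shape as the query above. Payment 2 finds a matching earlier payment
-- (payment 1), so p2.member_id is not NULL and the row is filtered out.
-- Payment 3 finds no match, so p2.member_id is NULL and the row is kept.
SELECT COUNT(p1.payment_id) AS signup_count,
       SUM(p1.amount)       AS signup_amount
FROM payments_demo p1
LEFT JOIN payments_demo p2
       ON p1.member_id = p2.member_id
      AND p2.completed = 1
      AND p2.tm_completed < '2009-05-01'
WHERE p1.completed > 0
  AND p1.tm_completed BETWEEN '2009-05-01' AND '2009-05-30'
  AND p2.member_id IS NULL;
-- signup_count = 1, signup_amount = 30.00

DROP TABLE payments_demo;
```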
Avoid using IN with a subquery; MySQL does not optimize these well (though there are pending optimizations in 5.4 and 6.0 regarding this; see here). Rewriting this as a join will probably get you a performance boost:
SELECT count(payment_id) as signup_count, sum(amount) as signup_amount
FROM payments p
LEFT JOIN (SELECT p2.member_id
FROM payments p2
WHERE p2.completed=1
AND p2.tm_completed < '2009-05-01'
AND p2.tm_completed IS NOT NULL
GROUP BY p2.member_id) foo
ON p.member_id = foo.member_id
WHERE tm_completed BETWEEN '2009-05-01' AND '2009-05-30'
AND completed > 0
AND tm_completed IS NOT NULL
-- keep only members with no earlier completed payment; this test has to be in WHERE, not in ON
AND foo.member_id IS NULL
Second, I would have to see your table schema; are you using indexes?
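One way to gather that information is with the statements below. This is only a sketch of what to run and post back, using the table name and the original query from the question.

```sql
-- Full table definition, including column types and all indexes.
SHOW CREATE TABLE payments;

-- Index list with cardinality estimates.
SHOW INDEX FROM payments;

-- Execution plan of the original query: shows which indexes MySQL actually
-- picks and how the subquery is evaluated.
EXPLAIN
SELECT count(payment_id) AS signup_count, sum(amount) AS signup_amount
FROM payments p
WHERE tm_completed BETWEEN '2009-05-01' AND '2009-05-30'
  AND completed > 0
  AND tm_completed IS NOT NULL
  AND member_id NOT IN (SELECT p2.member_id
                        FROM payments p2
                        WHERE p2.completed = 1
                          AND p2.tm_completed < '2009-05-01'
                          AND p2.tm_completed IS NOT NULL
                        GROUP BY p2.member_id);
```

In the EXPLAIN output, a select_type of DEPENDENT SUBQUERY on the inner SELECT would suggest that it is being re-evaluated for each outer row, which is the usual reason this kind of query is slow.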