Suppose I have a table of customers and a table of purchases. Each purchase belongs to one customer. I want to get a list of all customers along with their last purchase in one SELECT statement. What is the best practice? Any advice on building indexes?
Please use these table/column names in your answer:
customer: id, name
purchase: id, customer_id, item_id, date
And in more complicated situations, would it be (performance-wise) beneficial to denormalize the database by putting the last purchase into the customer table?
If the (purchase) id is guaranteed to be sorted by date, can the statements be simplified by using something like LIMIT 1?
This is an example of the greatest-n-per-group problem that has appeared regularly on StackOverflow.
Here's how I usually recommend solving it:
SELECT c.*, p1.*
FROM customer c
JOIN purchase p1 ON (c.id = p1.customer_id)
LEFT OUTER JOIN purchase p2
  ON (c.id = p2.customer_id
      AND (p1.date < p2.date
           OR (p1.date = p2.date AND p1.id < p2.id)))
WHERE p2.id IS NULL;
Explanation: given a row p1, there should be no row p2 with the same customer and a later date (or, in the case of ties, a later id). When we find that to be true, then p1 is the most recent purchase for that customer.
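To see the "no later row p2" condition at work, here is a minimal sketch that runs the query above against an in-memory SQLite database from Python. The sample rows are made up for illustration, and dates are stored as ISO-8601 text so that string comparison matches chronological order; only the table and column names come from the question.

```python
import sqlite3

# Hypothetical sample data; only table/column names come from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE purchase (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customer(id),
        item_id INTEGER,
        date TEXT
    );
    INSERT INTO customer VALUES (1, 'Alice'), (2, 'Bob');
    INSERT INTO purchase VALUES
        (1, 1, 10, '2024-01-05'),
        (2, 1, 11, '2024-03-01'),
        (3, 2, 12, '2024-02-10'),
        (4, 2, 13, '2024-02-10');
""")
# Purchases 3 and 4 share a date, so the tie falls back to the larger id.

LAST_PURCHASE_SQL = """
    SELECT c.id, c.name, p1.id, p1.item_id, p1.date
    FROM customer c
    JOIN purchase p1 ON (c.id = p1.customer_id)
    LEFT OUTER JOIN purchase p2
      ON (c.id = p2.customer_id
          AND (p1.date < p2.date
               OR (p1.date = p2.date AND p1.id < p2.id)))
    WHERE p2.id IS NULL
    ORDER BY c.id
"""

rows = conn.execute(LAST_PURCHASE_SQL).fetchall()
for row in rows:
    print(row)
```

With this data, Alice is paired with purchase 2 (the later date) and Bob with purchase 4 (same date as purchase 3, larger id). Note that the inner JOIN means a customer with no purchases at all would not appear in the result.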
Regarding indexes, I'd create a compound index in purchase over the columns (customer_id, date, id). That may allow the outer join to be done using a covering index. Be sure to test on your platform, because optimization is implementation-dependent. Use the features of your RDBMS to analyze the optimization plan, e.g. EXPLAIN on MySQL.
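As a rough illustration of that workflow, the sketch below creates the compound index and prints the plan the engine reports. It uses SQLite's EXPLAIN QUERY PLAN through Python's sqlite3 module as a stand-in for MySQL's EXPLAIN; the index name purchase_customer_date_id is an arbitrary choice, and the plan text will differ between engines and versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE purchase (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER,
        item_id INTEGER,
        date TEXT
    );
    -- Compound index over (customer_id, date, id); the name is hypothetical.
    CREATE INDEX purchase_customer_date_id
        ON purchase (customer_id, date, id);
""")

LAST_PURCHASE_SQL = """
    SELECT c.*, p1.*
    FROM customer c
    JOIN purchase p1 ON (c.id = p1.customer_id)
    LEFT OUTER JOIN purchase p2
      ON (c.id = p2.customer_id
          AND (p1.date < p2.date
               OR (p1.date = p2.date AND p1.id < p2.id)))
    WHERE p2.id IS NULL
"""

# Each plan row ends with a human-readable description of one step.
plan = conn.execute("EXPLAIN QUERY PLAN " + LAST_PURCHASE_SQL).fetchall()
for step in plan:
    print(step[-1])

# Confirm the index was actually created.
index_names = [
    name for (name,) in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'index'"
    )
]
print(index_names)
```

Whether the optimizer actually picks the index depends on the engine and on the data distribution, so it is worth re-checking the plan against realistic table sizes.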
Some people use subqueries instead of the solution I show above, but I find my solution makes it easier to resolve ties.
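For comparison, here is a sketch of one possible subquery form, again run through SQLite from Python with made-up sample rows. It is not the approach recommended above: it picks each customer's newest purchase with a correlated subquery ordered by date and then id, and uses LIMIT 1, which also touches on the LIMIT 1 part of the question. Because it uses a LEFT JOIN from customer, customers without any purchase are kept, with NULLs in the purchase columns.

```python
import sqlite3

# Hypothetical sample data; only table/column names come from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE purchase (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customer(id),
        item_id INTEGER,
        date TEXT
    );
    INSERT INTO customer VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Carol');
    INSERT INTO purchase VALUES
        (1, 1, 10, '2024-01-05'),
        (2, 1, 11, '2024-03-01'),
        (3, 2, 12, '2024-02-10'),
        (4, 2, 13, '2024-02-10');
""")

SUBQUERY_SQL = """
    SELECT c.id, c.name, p.id, p.item_id, p.date
    FROM customer c
    LEFT JOIN purchase p
      ON p.id = (SELECT p2.id
                 FROM purchase p2
                 WHERE p2.customer_id = c.id
                 ORDER BY p2.date DESC, p2.id DESC
                 LIMIT 1)
    ORDER BY c.id
"""

rows = conn.execute(SUBQUERY_SQL).fetchall()
for row in rows:
    print(row)
```

Carol has no purchases in this sample, so her row comes back with None in the purchase columns. Correlated subqueries with LIMIT are not supported in every position by every RDBMS, so treat this as something to verify on your platform rather than a portable recipe.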