I have a situation which I think can be compared to services like CamelCamelCamel, Keepa and so on.
Lets say I track the price of an article on each day for a couple of countries. So my table, lets call it Trend, would look something like this
Id Created ArticleId Country Price
-------------------------------------------------
01 19/11/05 452 US 45.90
02 19/11/05 452 CA 52.99
03 19/11/05 452 MX 99.99
04 19/11/06 452 US 20.00
05 19/11/06 452 CA 25.00
06 19/11/06 452 MX 50.00
...
97 19/11/05 738 US 12.99
98 19/11/05 738 CA 17.50
99 19/11/05 738 MX 45.50
So it's the next day and I want to update the Trend table. If the price in a country is still the same, I skip the article/country combination. If there is a new price I'll add a new record.
Now I want to query the table to get each ArticleId / Country combination. But only the last record of it (orderd by timestamp). So taken the example above I'd expect to get the records 04, 05 and 06 for ArticleId 452. Not 01, 02 and 03
So I start out with this basic query. But how do I get to change it to get my expected results?
SELECT
*
FROM
Trend
ORDER BY
Created DESC
One method uses a correlated subquery for filtering:
select t.*
from trend t
where t.created = (
select max(t1.created)
from trend t1
where t1.articleId = t.articleId and t1.country = t.country
)
For performance, you want an index on (articleId, country, created).
You might also want to consider the anti-left join approach:
select t.*
from trend t
left join trend t1
on t1.articleId = t.articleId
and t1.country = t.country
and t1.created > t.created
where t1.articleId is null
Finally, another typical solution is to join the table with an aggregate query:
select t.*
from trend t
inner join (
select articleId, country, max(created) created
from trend
group by articleId, country
) t1
on t1.articleId = t.articleId
and t1.country = t.country
and t1.created = t.created
Which solution performs better depends on the size and distribution of your data.
You can do this with a combination of DISTINCT and CROSS APPLY.
SELECT DISTINCT ca.Id, ca.Created, t.ArticleId, t.Country, ca.Price
FROM Trend t
CROSS APPLY (SELECT TOP 1 Id, Created, Price
FROM Trend
WHERE ArticleId = t.ArticleId AND Country = t.Country
ORDER BY Created DESC) ca
Often times when writing queries using APPLY your joining columns (ArticleId and Country) make up a unique key on another table. If this applies to your database you can drop the DISTINCT and speed up the query.
SELECT ca.Id, ca.Created, a.ArticleId, a.Country, ca.Price
FROM Article a
CROSS APPLY (SELECT TOP 1 Id, Created, Price
FROM Trend
WHERE ArticleId = a.ArticleId AND Country = a.Country
ORDER BY Created DESC) ca
Lastly, if you're having issues with performance, you may want to create an index.
CREATE NONCLUSTERED INDEX [NC_Trend_ArticleId] ON [Trend]
(
[ArticleId] ASC,
[Country] ASC,
[Created] ASC
)
INCLUDE ([Price])
Presumably Id is a PRIMARY KEY and is already covered by a CLUSTERED INDEX, if so, the above should be appropriate for most solutions.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With