I'm using SQL Server 2008 and I got stuck in this vicious circle between DISTINCT
and GROUP BY
I've got the following dummy table myTable
:
ID Street City PostalCode ProjectID Date NameId
1 Bar Street Sunny Beach 666 7 25/08/2013 111
2 Sin Street Ibiza 999 5 12/06/2013 222
3 Bar Street Sunny Beach 666 7 07/08/2013 333
4 Bora Bora Bora Bora 1000 10 17/07/2013 444
5 Sin Street Ibiza 999 5 04/07/2013 555
I want to obtain all records (probably first occurrence) with distinct Addresses(Street, City, PostalCode) and ProjectIDs. For example the result here should be:
ID Street City PostalCode ProjectID Date NameId
1 Bar Street Sunny Beach 666 7 25/08/2013 111
2 Sin Street Ibiza 999 5 12/06/2013 222
4 Bora Bora Bora Bora 1000 10 17/07/2013 444
I've tried with DISTINCT
on all columns, but this won't work, since ID
is unique and always returns all the columns. Also tried Group by Street, City PostalCode ProjectID
, but an error occurred regarding Date
and NameId
.
Column '' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
So how can I get a result, where a subset of columns are distinct?
You want to use the row_number()
function:
select t.ID, t.Street, t.City, t.PostalCode, t.ProjectID, t.Date, t.NameId
from (select t.*,
row_number() over (partition by Street, City, PostalCode, ProjectId
order by id
) as seqnum
from t
) t
where seqnum = 1;
This is a window function that assigns a sequential value to rows with the same values in certain columns (defined by the partition by
clause). The ordering within these rows is determined by the order by
clause. In this case, it starts ordering with the lowest id
in the group, so the outer query just selects the first one.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With