I have searched far and wide for an answer to this problem. I'm using Microsoft SQL Server. Suppose I have a table that looks like this:
+--------+---------+-------------+-------------+
| ID     | NUMBER  | COUNTRY     | LANG        |
+--------+---------+-------------+-------------+
| 1      | 3968    | UK          | English     |
| 2      | 3968    | Spain       | Spanish     |
| 3      | 3968    | USA         | English     |
| 4      | 1234    | Greece      | Greek       |
| 5      | 1234    | Italy       | Italian     |
+--------+---------+-------------+-------------+
I want to perform one query that returns a single row for each unique 'NUMBER' value (whether it is the first or last row doesn't bother me). So this would give me:
+--------+---------+-------------+-------------+
| ID     | NUMBER  | COUNTRY     | LANG        |
+--------+---------+-------------+-------------+
| 1      | 3968    | UK          | English     |
| 4      | 1234    | Greece      | Greek       |
+--------+---------+-------------+-------------+
How is this achievable?
Adding the DISTINCT keyword to a SELECT query causes it to return only unique values for the specified column list so that duplicate rows are removed from the result set.
If you want the query to return only unique rows, use the keyword DISTINCT after SELECT. DISTINCT can be used to fetch unique rows from one or more columns. You need to list the columns after the DISTINCT keyword.
DISTINCT works on the combination of values across all columns in the SELECT clause, not on a single column of a multi-column list.
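To make the point about column combinations concrete, here is a small sketch against the sample data from the question. The table variable @t and its column types are assumptions made for illustration; the question does not show the real table definition.

```sql
-- Hypothetical table variable mirroring the sample data in the question.
DECLARE @t TABLE (ID int, NUMBER int, COUNTRY varchar(20), LANG varchar(20));

INSERT INTO @t (ID, NUMBER, COUNTRY, LANG) VALUES
    (1, 3968, 'UK',     'English'),
    (2, 3968, 'Spain',  'Spanish'),
    (3, 3968, 'USA',    'English'),
    (4, 1234, 'Greece', 'Greek'),
    (5, 1234, 'Italy',  'Italian');

-- DISTINCT over a single column: one row per NUMBER value.
SELECT DISTINCT NUMBER FROM @t;

-- DISTINCT over two columns: deduplicates (NUMBER, LANG) pairs,
-- so only the repeated pair (3968, 'English') collapses.
SELECT DISTINCT NUMBER, LANG FROM @t;

-- DISTINCT over every column: ID is unique, so nothing is removed.
SELECT DISTINCT ID, NUMBER, COUNTRY, LANG FROM @t;
```

Because ID differs on every row, DISTINCT alone cannot produce the two-row result the question asks for while still returning the other columns.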
Note that DISTINCT only removes duplicate rows from the result set. It doesn't delete duplicate rows in the table. If you want to select two columns and remove duplicates in one column, you should use the GROUP BY clause instead.
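One way the GROUP BY suggestion could be applied to the question's table is to pick the smallest ID per NUMBER and join back to fetch the remaining columns. This is only a sketch: the table variable @t, its column types, and the alias MinID are assumptions for illustration.

```sql
-- Hypothetical table variable mirroring the sample data in the question.
DECLARE @t TABLE (ID int, NUMBER int, COUNTRY varchar(20), LANG varchar(20));

INSERT INTO @t (ID, NUMBER, COUNTRY, LANG) VALUES
    (1, 3968, 'UK',     'English'),
    (2, 3968, 'Spain',  'Spanish'),
    (3, 3968, 'USA',    'English'),
    (4, 1234, 'Greece', 'Greek'),
    (5, 1234, 'Italy',  'Italian');

-- The inner query finds the lowest ID for each NUMBER;
-- the join then returns the full row belonging to that ID.
SELECT t.ID, t.NUMBER, t.COUNTRY, t.LANG
FROM @t AS t
JOIN (
    SELECT NUMBER, MIN(ID) AS MinID
    FROM @t
    GROUP BY NUMBER
) AS m
    ON t.ID = m.MinID
ORDER BY t.ID;
```

With the sample data this returns the rows with ID 1 and ID 4. Using MAX(ID) instead would return the last row per NUMBER.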
A very typical approach to this type of problem is to use row_number():

    select t.*
    from (select t.*,
                 row_number() over (partition by number order by id) as seqnum
          from t
         ) t
    where seqnum = 1;
This is more generalizable than using a comparison to the minimum id. For instance, you can get a random row by using order by newid(). You can select 2 rows by using where seqnum <= 2.
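The two variations mentioned above might look like the following. As before, the table variable @t and its column types are assumptions that stand in for the real table.

```sql
-- Hypothetical table variable mirroring the sample data in the question.
DECLARE @t TABLE (ID int, NUMBER int, COUNTRY varchar(20), LANG varchar(20));

INSERT INTO @t (ID, NUMBER, COUNTRY, LANG) VALUES
    (1, 3968, 'UK',     'English'),
    (2, 3968, 'Spain',  'Spanish'),
    (3, 3968, 'USA',    'English'),
    (4, 1234, 'Greece', 'Greek'),
    (5, 1234, 'Italy',  'Italian');

-- Variation 1: one randomly chosen row per NUMBER.
-- NEWID() produces a fresh value per row, so the ordering within
-- each partition changes from run to run.
SELECT s.ID, s.NUMBER, s.COUNTRY, s.LANG
FROM (
    SELECT t.*,
           ROW_NUMBER() OVER (PARTITION BY NUMBER ORDER BY NEWID()) AS seqnum
    FROM @t AS t
) AS s
WHERE s.seqnum = 1;

-- Variation 2: up to two rows per NUMBER, lowest IDs first.
SELECT s.ID, s.NUMBER, s.COUNTRY, s.LANG
FROM (
    SELECT t.*,
           ROW_NUMBER() OVER (PARTITION BY NUMBER ORDER BY ID) AS seqnum
    FROM @t AS t
) AS s
WHERE s.seqnum <= 2
ORDER BY s.NUMBER, s.ID;
```

On the sample data the second query returns IDs 4 and 5 for NUMBER 1234 and IDs 1 and 2 for NUMBER 3968. The first query always returns two rows, but which rows they are varies between executions.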