How do I perform a DISTINCT operation on a single column after a UNION is performed?
T1
--
ID Value
1 1
2 2
3 3
T2
--
ID Value
1 2
4 4
5 5
I am trying to return the table:
ID Value
1 1
2 2
3 3
4 4
5 5
I tried:
SELECT DISTINCT ID, Value
FROM (SELECT * FROM T1 UNION SELECT * FROM T2) AS T3
This does not seem to work.
The UNION DISTINCT operator combines the result sets of two or more SELECT statements into one result set. Duplicate rows are eliminated, so every row in the result is distinct.
SELECT DISTINCT returns distinct combinations of the selected columns and is often used with a JOIN. UNION combines two result sets that have an equal number of columns and returns only the distinct rows.
The SQL UNION ALL operator does not remove duplicates. If you wish to remove duplicates, try using the UNION operator.
UNION DISTINCT combines the result sets of two or more input queries and discards duplicate rows. Note: UNION DISTINCT in BigQuery is equivalent to UNION in standard SQL.
UNION DISTINCT is the default mode, and it eliminates duplicate records from the combined result. That’s similar to the logic of SELECT DISTINCT or FOR ALL ENTRIES.
The UNION operator is used to combine the result sets of two or more SELECT statements. Every SELECT statement within a UNION must have the same number of columns, and the columns in every SELECT statement must be in the same order. The UNION operator selects only distinct rows by default. To allow duplicate rows, use UNION ALL.
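To make the difference between UNION and UNION ALL concrete, here is a minimal sketch that runs both against an in-memory SQLite database through Python's sqlite3 module. The tables a and b and their contents are made up for illustration (they are not the T1 and T2 from the question), and other database engines may differ in details such as default row order.

```python
import sqlite3

# Hypothetical tables a and b that share exactly one identical row, (2, 2).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER, value INTEGER);
    CREATE TABLE b (id INTEGER, value INTEGER);
    INSERT INTO a VALUES (1, 1), (2, 2);
    INSERT INTO b VALUES (2, 2), (3, 3);
""")

# UNION drops rows that are identical in every column.
union_rows = conn.execute(
    "SELECT id, value FROM a UNION SELECT id, value FROM b ORDER BY id"
).fetchall()

# UNION ALL keeps every row from both inputs, duplicates included.
union_all_rows = conn.execute(
    "SELECT id, value FROM a UNION ALL SELECT id, value FROM b ORDER BY id"
).fetchall()

print(union_rows)      # [(1, 1), (2, 2), (3, 3)]
print(union_all_rows)  # [(1, 1), (2, 2), (2, 2), (3, 3)]
```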
UNION ALL is faster than UNION, but it does not remove duplicates. Including DISTINCT in a UNION query does not add anything. Based on the query you have provided I am assuming the DistinctValue column does not initially give you a set of distinct values so one approach you may try is the following:
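Why are you using a sub-query? The outer SELECT DISTINCT adds nothing, because this returns the same rows:
SELECT * FROM T1
UNION
SELECT * FROM T2
UNION removes duplicate rows (UNION ALL does not). It compares whole rows, though, so (1, 1) and (1, 2) are both kept and ID 1 still appears twice.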
As far as I can tell, there is no "one-column distinct": distinct is always applied to a whole record (unless used within an aggregate like count(distinct name)). The reason is that SQL cannot guess which values of Value to keep for you and which to drop. That is something you need to define yourself.
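A quick way to see the whole-record behaviour is to run the query from the question against the sample data. This is only a sketch: it uses an in-memory SQLite database via Python's sqlite3 module, and the INTEGER column types are an assumption, since the question does not state them.

```python
import sqlite3

# Recreate T1 and T2 from the question (column types are assumed).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T1 (ID INTEGER, Value INTEGER);
    CREATE TABLE T2 (ID INTEGER, Value INTEGER);
    INSERT INTO T1 VALUES (1, 1), (2, 2), (3, 3);
    INSERT INTO T2 VALUES (1, 2), (4, 4), (5, 5);
""")

# The query from the question, with ORDER BY added for stable output.
distinct_rows = conn.execute("""
    SELECT DISTINCT ID, Value
    FROM (SELECT * FROM T1 UNION SELECT * FROM T2) AS T3
    ORDER BY ID, Value
""").fetchall()

# (1, 1) and (1, 2) differ in Value, so DISTINCT keeps both.
print(distinct_rows)  # [(1, 1), (1, 2), (2, 2), (3, 3), (4, 4), (5, 5)]
```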
Try using GROUP BY to ensure ID is not repeated, and any aggregate (here MIN, as in your example it was the minimum that survived) to select a particular value of Value:
SELECT ID, min(Value) FROM (SELECT * FROM T1 UNION ALL SELECT * FROM T2) AS T3
GROUP BY ID
This should be exactly what you need. It is not the same query, and there is no distinct in it, but it returns what is shown in the example.
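For anyone who wants to check the GROUP BY approach against the sample data, here is a small sketch using an in-memory SQLite database through Python's sqlite3 module. The INTEGER column types, the AS Value alias, and the ORDER BY clause are additions for the sake of the demonstration and are not part of the query above.

```python
import sqlite3

# Recreate T1 and T2 from the question (column types are assumed).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T1 (ID INTEGER, Value INTEGER);
    CREATE TABLE T2 (ID INTEGER, Value INTEGER);
    INSERT INTO T1 VALUES (1, 1), (2, 2), (3, 3);
    INSERT INTO T2 VALUES (1, 2), (4, 4), (5, 5);
""")

# One row per ID; MIN picks the smallest Value when an ID occurs in both tables.
grouped_rows = conn.execute("""
    SELECT ID, MIN(Value) AS Value
    FROM (SELECT * FROM T1 UNION ALL SELECT * FROM T2) AS T3
    GROUP BY ID
    ORDER BY ID
""").fetchall()

print(grouped_rows)  # [(1, 1), (2, 2), (3, 3), (4, 4), (5, 5)]
```

If a different row should win for a repeated ID, swap MIN for another aggregate such as MAX.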