Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Create a UNION query that identifies which table the unique data came from

I have two tables with data. Both tables have a CUSTOMER_ID column (which is numeric). I am trying to get a list of all the unique values for CUSTOMER_ID and know whether or not the CUSTOMER_ID exists in both tables or just one (and which one).

I can easily get a list of the unique CUSTOMER_ID:

SELECT tblOne.CUSTOMER_ID
FROM tblOne.CUSTOMER_ID
UNION
SELECT tblTwo.CUSTOMER_ID
FROM tblTwo.CUSTOMER_ID

I can't do just add an identifier column to the SELECT statemtn (like: SELECT tblOne.CUSTOMER_ID, "Table1" AS DataSource) because then the records wouldn't be unique and it will get both sets of data.

I feel I need to add it somewhere else in this query but am not sure how.

Edit for clarity:

For the union query output I need an additional column that can tell me if the unique value I am seeing exists in: (1) both tables, (2) table one, or (3) table two.

like image 656
thornomad Avatar asked Oct 25 '13 13:10

thornomad


3 Answers

If the CUSTOMER_ID appears in both tables then we'll have to arbitrarily pick which table to call the source. The following query uses "tblOne" as the [SourceTable] in that case:

SELECT
    CUSTOMER_ID,
    MIN(Source) AS SourceTable,
    COUNT(*) AS TableCount
FROM
    (
        SELECT DISTINCT
            CUSTOMER_ID, 
            "tblOne" AS Source
        FROM tblOne
    UNION ALL
        SELECT DISTINCT
            CUSTOMER_ID,
            "tblTwo" AS Source
        FROM tblTwo
    )
GROUP BY CUSTOMER_ID
like image 184
Gord Thompson Avatar answered Sep 22 '22 16:09

Gord Thompson


Gord Thompson's answer is correct. But, it is not necessary to do a distinct in the subqueries. And, you can return a single column with the information you are looking for:

select customer_id,
       iif(min(which) = max(which), min(which), "both") as DataSource
from (select customer_id, "tblone" as which
      from tblOne
      UNION ALL
      select customer_id, "tbltwo" as which
      from tblTwo
     ) t
group by customer_id
like image 29
Gordon Linoff Avatar answered Sep 23 '22 16:09

Gordon Linoff


We could add an identifier column with the integer data type and then do an outer query:

SELECT 
CUSTOMER_ID,
sum(Table)

FROM 
(
SELECT 
  DISTINCT CUSTOMER_ID, 
  1 AS Table
FROM tblOne
UNION
SELECT 
  DISTINCT CUSTOMER_ID,
  2 AS Table
FROM tblTwo
)
GROUP BY CUSTOMER_ID` 

So if the "sum is 1" then it comes from tablOne and if it is 2 then it comes from tableTwo an if it is 3 then it exists in both

If you want to add a 3rd table in the union then give it a value of 4 so that you should have a unique sum for each combination

like image 36
Shulem Neuwirth Avatar answered Sep 21 '22 16:09

Shulem Neuwirth