Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

SQL Count duplicates in one column but condition for duplicate is not only one fixed value

I tried do search other posts but could only find about finding duplicates about one fixed values.

So imagine following table:

 ╔══════════╦═══════╗
 ║ customer ║ color ║
 ╠══════════╬═══════╣
 ║        1 ║ black ║
 ║        1 ║ black ║
 ║        2 ║ red   ║
 ║        2 ║ black ║
 ║        3 ║ red   ║
 ║        3 ║ red   ║
 ║        3 ║ red   ║
 ║        4 ║ black ║
 ║        5 ║ black ║
 ║        5 ║ green ║
 ║        6 ║ purple║
 ╚══════════╩═══════╝

I want to select the "duplicates" means the following customers:

  • with more than one black
  • one black and other red would be also an duplicate
  • no duplicate: customer can have as many reds as he want

What I have so far

Currently what I can select is only about the black duplicates but I can not combine it wiht the condition "one black, no more red".

SELECT customer FROM events WHERE
    color = 'black'
    group by customer
    having count(*) > 1

Maybe I could first count the blacks and than join again with the existing table count the additional blacks and reds?

Desired Output

I would like to have the following result as customers: 1,2. Even better would be an output where I know if customer was a double black or a black + some reds:

╔══════════╦═══════════╦══════════════╗
║ customer ║ blackOnly ║ blackPlusRed ║
╠══════════╬═══════════╬══════════════╣
║        1 ║ yes       ║ no           ║
║        2 ║ no        ║ yes          ║
╚══════════╩═══════════╩══════════════╝

Sorry had to modify my post

  • I added customer 5 and 6 in example table and more colors. So maybe some suggestions does not work anymore :-(. (Just want to edit fast, so If I didn't follow some modifying rules, just tell me)
  • Thanks for the very fast answers already so far
like image 331
timguy Avatar asked Oct 26 '15 08:10

timguy


2 Answers

This query first creates a temporary table containing the count of black and red for each customer, and then queries this table to obtain the blackOnly and blackPlusRed column values, for each customer.

SELECT t.customer,
    CASE WHEN t.black > 1 AND t.red = 0 THEN 'yes' ELSE 'no' END AS blackOnly,
    CASE WHEN t.black > 0 AND t.red > 0 THEN 'yes' ELSE 'no' END AS blackPlusRed
FROM
(
    SELECT *,
        SUM(CASE WHEN color='black' THEN 1 ELSE 0 END) AS black,
        SUM(CASE WHEN color='red' THEN 1 ELSE 0 END) AS red
    FROM events
    GROUP BY customer
) t

If you want to add a new color condition, e.g. only red, then you can add a new CASE statement to the outer query:

CASE WHEN t.red > 1 AND t.black = 0 THEN 'yes' ELSE 'no' END AS redOnly

Here is a demo:

SQLFiddle

like image 62
Tim Biegeleisen Avatar answered Oct 17 '22 12:10

Tim Biegeleisen


You want all customers having 'black' and at least two records. You can do this with conditional aggregation:

select 
  customer,  
  case when count(distinct color) = 1 then 'yes' else 'no' end as blackOnly,
  case when count(distinct color) > 1 then 'yes' else 'no' end as blackPlusRed
from events 
group by customer
having count(*) > 1
and count(case when color = 'black' then 1 end) > 0;

UPDATE: If you allow for other colors, the query changes slightly:

select 
  customer,  
  case when count(case when color = 'red' then 1 end) = 0 then 'yes' else 'no' end as blackOnly,
  case when count(case when color = 'red' then 1 end) > 0 then 'yes' else 'no' end as blackPlusRed
from events 
group by customer
having count(case when color = 'black' then 1 end) > 1
or
(
  count(case when color = 'black' then 1 end) > 0
  and 
  count(case when color = 'red' then 1 end) > 0
);
like image 44
Thorsten Kettner Avatar answered Oct 17 '22 11:10

Thorsten Kettner