PostgreSQL

Question

I have the following table and some sample records in it:

  id  | attr1_id | attr2_id |      user_id      | rating_id | override_comment
------+----------+----------+-------------------+-----------+------------------
 1    |      188 |      201 | [email protected] |         3 |
 2    |      193 |      201 | [email protected] |         2 |
 3    |      193 |      201 | [email protected] |         1 |
 4    |      194 |      201 | [email protected] |         1 |
 5    |      194 |      201 | [email protected] |         1 |
 6    |      192 |      201 | [email protected] |         1 |

The combination of (attr1_id, attr2_id, user_id) is UNIQUE, meaning each user can only create one record with a specific pair of attribute ids.

My goal is to count the number of rows where rating_id = 1, but only count each combiniation of attr1_id and attr2_id only once, and only where there doesn't exist any other row (by other users) that have rating_id > 1 and refer to the same attr1_id and attr2_id. Note that the combination of attr1_id and attr2_id can be switched around, so given these two records:

  id  | attr1_id | attr2_id |      user_id       | rating_id | override_comment
------+----------+----------+--------------------+-----------+------------------
  20  |       5  |       2  | [email protected]  |         3 |
------+----------+----------+--------------------+-----------+------------------
  21  |       2  |       5  | [email protected]  |         1 |

no row should be counted, as the rows refer to the same combination of attr_ids and one of them has rating_id > 1.

However, if these two rows exist:

  id  | attr1_id | attr2_id |      user_id       | rating_id | override_comment
------+----------+----------+--------------------+-----------+------------------
  20  |       5  |       2  | [email protected]  |         1 |
------+----------+----------+--------------------+-----------+------------------
  21  |       2  |       5  | [email protected]  |         1 |
------+----------+----------+--------------------+-----------+------------------
  22  |       2  |       5  | [email protected]  |         1 |

all rows should only be counted as one, because they all share the same combination of attr1_id and attr2_id and all have rating_id = 1.

My approach so far is this, but it results in no rows being selected at all.

SELECT *
FROM compatibility c
WHERE rating_id > 1
  AND NOT EXISTs
    (SELECT *
     FROM compatibility c2
     WHERE c.rating_id > 1
       AND (
             (c.attr1_id = c2.attr1_id) AND (c.attr2_id = c2.attr2_id)
             OR
             (c.attr1_id = c2.attr2_id) AND (c.attr2_id = c2.attr1_id)
           )
    )

How can I achieve this?

Erwin Brandstetter · Accepted Answer

My goal is to count the number of rows where rating_id = 1, but only count each combiniation of attr1_id and attr2_id only once, and only where there doesn't exist any other row (by other users) that have rating_id > 1.

Building on your original

Your original query was on the right track to exclude offending rows. You just had > instead of =, and the count was missing, yet.

SELECT count(*) AS ct
FROM  (
   SELECT 1
   FROM   compatibility c
   WHERE  rating_id = 1
   AND    NOT EXISTS (
      SELECT 1
      FROM   compatibility c2
      WHERE  c2.rating_id > 1
      AND   (c2.attr1_id = c.attr1_id AND c2.attr2_id = c.attr2_id OR
             c2.attr1_id = c.attr2_id AND c2.attr2_id = c.attr1_id))
   GROUP  BY least(attr1_id, attr2_id), greatest(attr1_id, attr2_id)
   ) sub;

Shorter and faster

SELECT count(*) AS ct
FROM  (
   SELECT FROM compatibility  -- empty SELECT list is enough for count(*)
   GROUP  BY least(attr1_id, attr2_id), greatest(attr1_id, attr2_id)
   HAVING max(rating_id) <= 1
   ) sub;

Similar to this earlier answer with more explanation.
HAVING max(rating_id) <= 1 implements your requirement exactly.

fiddle
_{Old sqlfiddle}

Gordon Linoff · Answer

If I understand correctly, you want pairs of attributes whose ratings are always "1".

This should give you the attributes:

select least(attr1_id, attr2_id) as a1, greatest(attr1_id, attr2_id) as a2,
       min(rating_id) as minri, max(rating_id) as maxri
from compatibility c
group by least(attr1_id, attr2_id), greatest(attr1_id, attr2_id)
having min(rating_id) = 1 and max(rating_id) = 1;

To get the count, just use this as a subquery:

select count(*)
from (select least(attr1_id, attr2_id) as a1, greatest(attr1_id, attr2_id) as a2,
             min(rating_id) as minri, max(rating_id) as maxri
      from compatibility c
      group by least(attr1_id, attr2_id), greatest(attr1_id, attr2_id)
      having min(rating_id) = 1 and max(rating_id) = 1
     ) c

PostgreSQL - select count(*) for rows where a condition holds

Tags:

sql

select

count

doque

2 Answers

Building on your original

Shorter and faster

Erwin Brandstetter

Gordon Linoff

Recent Activity

Donate For Us

PostgreSQL - select count(*) for rows where a condition holds

Tags:

sql

select

postgresql

count

doque

2 Answers

Building on your original

Shorter and faster

Erwin Brandstetter

Gordon Linoff

Related questions

Recent Activity

Donate For Us