t1
id|entity_type
9|3
9|4
9|5
2|3
2|5
t2
id|entity_type
1|3
1|4
1|5
SELECT t1.id, array_agg(t1.entity_type)
FROM t1
GROUP BY
t1.id
HAVING ARRAY_AGG(t1.entity_type ORDER BY t1.entity_type) =
(SELECT ARRAY_AGG(t2.entity_type ORDER BY t2.entity_type)
FROM t2
WHERE t2.id = 1
GROUP BY t2.id);
Result:
t1.id = 9|array_agg{3,4,5}
I have two tables t1 and t2. I want to get value of t1.id where t1.entity_type array equals t2.entity_type array.
In this scenario everything works fine. For t2.id = 1 I receive t1.id = 9.
Both have the same array of entity_type: {3,4,5}
Now I'd like to get t1.id not only for equal sets, but also for smaller sets.
If I modify t2 this way:
t2
id|entity_type
1|3
1|4
and modify query this way:
SELECT t1.id, array_agg(t1.entity_type)
FROM t1
GROUP BY
t1.id
HAVING ARRAY_AGG(t1.entity_type ORDER BY t1.entity_type) >= /*MODIFICATION*/
(SELECT ARRAY_AGG(t2.entity_type ORDER BY t2.entity_type)
FROM t2
WHERE t2.id = 1
GROUP BY t2.id);
I don't receive the expected result:
t1.id = 1 has {3, 4, 5}
t2.id = 1 has {3, 4}
Arrays in t1 that contain the array in t2 should qualify. I expect to receive results as in first case but I get no rows.
Is there any method like: ARRAY_AGG contains another ARRAY_AGG?
It would be inefficient to use two differing invocations of array_agg(). Use the same (ORDER BY in SELECT list and HAVING clause):
SELECT id, array_agg(entity_type ORDER BY entity_type) AS arr
FROM t1
GROUP BY 1
HAVING array_agg(entity_type ORDER BY entity_type) = (
SELECT array_agg(entity_type ORDER BY entity_type)
FROM t2
WHERE id = 1
-- GROUP BY id -- not needed
);
Syntax basics in the manual.
@>Like Nick commented, your 2nd query would work with the "array contains" operator @>
SELECT id, array_agg(entity_type ORDER BY entity_type) AS arr
FROM t1
GROUP BY 1
HAVING array_agg(entity_type ORDER BY entity_type) @> (
SELECT array_agg(entity_type ORDER BY entity_type)
FROM t2
WHERE id = 1
);
But this is very inefficient for big tables.
The problem can be cast as a case of relational division. Depending on your table definition, there are more efficient techniques. We have gathered a whole arsenal under this related question:
Assuming (id, entity_type) is unique in both tables, this should be substantially faster for big tables, especially because it can use an index on t1 (as opposed to your original query):
SELECT t1.id
FROM t2
JOIN t1 USING (entity_type)
WHERE t2.id = 1
GROUP BY 1
HAVING count(*) = (SELECT count(*) FROM t2 WHERE id = 1);
You need two indexes:
First on t2(id), typically covered by the primary key.
Second:
CREATE INDEX t1_foo_idx ON t1 (entity_type, id);
The added id column is optional to allow index-only scans. Sequence of columns is essential:
fiddle
Old sqlfiddle
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With