My Postgres table schema has two fields:
items, which contains an array of (integer) item IDs . If the same ID is in the array twice, it is considered a primary item.
primary_items, another array of (integer) item IDs that I just added to the schema, so it is currently empty across all rows.
What I need to do is: for each row, check if there are duplicate IDs in items, and if so, put one of each duplicate ID in the primary_items field.
Any idea of how to approach this with a query? Ideally, I wouldn't have to write a helper program (Nodejs) to do this, but I can if needed.
Example:
Current:
documents_table
items primary_items
------------ -----------
{1, 2, 2, 4} {}
{1, 2, 3} {}
{3, 3} {}
{5, 4, 5, 4} {}
Desired:
documents_table
items primary_items
------------ -----------
{1, 2, 2, 4} {2}
{1, 2, 3} {}
{3, 3} {3}
{5, 4, 5, 4} {5,4}
You can find duplicate elements with this simple query:
select array_agg(item)
from (
select item
from unnest('{5, 4, 5, 4}'::int[]) as item
group by item
having count(*) > 1
) s
array_agg
-----------
{4,5}
(1 row)
Use the query in a function:
create or replace function find_primary_items(arr anyarray)
returns anyarray language sql immutable
as $$
select array_agg(item)
from (
select item
from unnest(arr) as item
group by item
having count(*) > 1
) s
$$;
Now the update is really easy:
update documents_table
set primary_items = coalesce(find_primary_items(items), '{}')
returning *;
items | primary_items
-----------+---------------
{1,2,2,4} | {2}
{1,2,3} | {}
{3,3} | {3}
{5,4,5,4} | {4,5}
(4 rows)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With