Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to use the physical location of rows (ROWID) in a DELETE statement

I have a table that has a lot of duplicated rows and no primary key.
I want to remove just the duplicated records, but when I try to do this it would remove all peers.

How can I find the ROWID from a table in Postgres?

like image 210
endrigoantonini Avatar asked Dec 27 '22 08:12

endrigoantonini


2 Answers

Simplify this by one query level:

DELETE FROM table_name
WHERE  ctid NOT IN (
   SELECT min(ctid)
   FROM   table_name
   GROUP  BY $other_columns);

.. where duplicates are defined by equality in $other_columns.
There is no need to include columns from the GROUP BY clause in the SELECT list, so you don't need another subquery.

ctid in the current manual.

like image 150
Erwin Brandstetter Avatar answered Dec 29 '22 11:12

Erwin Brandstetter


On PostgreSQL the physical location of the row is called CTID.

So if you want to view it use a QUERY like this:

SELECT CTID FROM table_name

To use it on a DELETE statement to remove the duplicated records use it like this:

DELETE FROM table_name WHERE CTID NOT IN (
  SELECT RECID FROM 
    (SELECT MIN(CTID) AS RECID, other_columns 
      FROM table_name GROUP BY other_columns) 
  a);

Remember that table_name is the desired table and other_columns are the columns that you want to use to filter that.

Ie:

DELETE FROM user_department WHERE CTID NOT IN (
  SELECT RECID FROM 
    (SELECT MIN(CTID) AS RECID, ud.user_id, ud.department_id
      FROM user_department ud GROUP BY ud.user_id, ud.department_id) 
  a);
like image 26
endrigoantonini Avatar answered Dec 29 '22 11:12

endrigoantonini