Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Understanding this SQL Query

I'm new to oracle database, can some help me understand this query. This query eliminates duplicates from table.

DELETE FROM table_name A 
 WHERE ROWID > (SELECT min(rowid) 
                  FROM table_name B 
                 WHERE A.key_values = B.key_values);

Any suggestions for improving the query are welcome.

Edit: No this is not homework , what I didn't understand is, what is being done by subquery and what does ROWID > On subquery do ?

This is the Source of the query

like image 680
SuperMan Avatar asked Feb 21 '11 03:02

SuperMan


2 Answers

Dissecting the actual mechanics:

DELETE FROM table_name A 

This is a standard query to delete records from the table named "table_name". Here, it has been aliased as "A" to be referred to in the subquery.

WHERE ROWID > 

This places a condition on the deletion, such that for each row encountered, the ROWID must meed a condition of being greater than..

            (SELECT min(rowid) 
              FROM table_name B 
             WHERE A.key_values = B.key_values)

This is a subquery that is correlated to the main DELETE statement. It uses the value A.key_values from the outside query. So given a record from the DELETE statement, it will run this subquery to find the minimum rowid (internal record id) for all records in the same table (aliased as B now) that bear the same key_values value.

So, to put it together, say you had these rows

rowid   |  key_values
=======    ============
1          A
2          B
3          B
4          C
5          A
6          B

The subquery works out that the min(rowid) for each record based on ALL records with the same key_values is:

rowid   |  key_values    | min(rowid)
=======    ============    ===========
1          A               1
2          B               2
3          B               2  **
4          C               4
5          A               1  **
6          B               2  **

For the records marked with **, the condition

WHERE ROWID > { subquery }

becomes true, and they are deleted.

EDIT - additional info

This answer previously stated that ROWID increased by insertion order. That is very untrue. The truth is that rowid is just a file.block.slot-on-block - a physical address.

http://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:53140678334596

Tom's Followup December 1, 2008 - 6am Central time zone:

it is quite possible that D will be "first" in the table - as it took over A's place.

If rowids always "grew", than space would never be reused (that would be an implication of rowids growing always - we would never be able to reuse old space as the rowid is just a file.block.slot-on-block - a physical address)

like image 62
RichardTheKiwi Avatar answered Nov 15 '22 11:11

RichardTheKiwi


Rowid is a pseudo-column that uniquely identifies each row in a table; it is numeric.

This query finds all rows in A where A.key_values = B.key_values and delete all of them but one with the minimal rowid. It's just a way to arbitrarily choose one duplicate to preserve.

like image 30
9000 Avatar answered Nov 15 '22 12:11

9000