Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Delete Duplicate rows in Vertica database

Tags:

vertica

Vertica allows duplicates to be inserted into the tables. I can view those using the 'analyze_constraints' function. How to delete duplicate rows from Vertica tables?

like image 340
Amit Verma Avatar asked Jun 19 '13 20:06

Amit Verma


1 Answers

You should try to avoid/limit using DELETE with a large number of records. The following approach should be more effective:

Step 1 Create a new table with the same structure / projections as the one containing duplicates:

create table mytable_new like mytable including projections ;

Step 2 Insert into this new table de-duplicated rows:

insert /* +direct */ into mytable_new select <column list> from (
    select * , row_number() over ( partition by <pk column list> ) as rownum from <table-name>
) a where a.rownum = 1 ;

Step 3 rename the original table (the one containing dups):

alter table mytable rename to mytable_orig ;

Step 4 rename the new table:

alter table mytable_new rename to mytable ;

That's all.

like image 63
mauro Avatar answered Oct 12 '22 01:10

mauro