 

Amazon Redshift Keys are not enforced - how to prevent duplicate data?


Just testing out AWS Redshift. Having discovered some duplicate data on an insert that I'd hoped would simply fail on the duplicated key column, reading the docs reveals that primary key constraints aren't "enforced".

Has anyone figured out how to prevent duplication on a primary key (per the "traditional" expectation)?

Thanks to any Redshift pioneers!

Saeven asked Mar 02 '13 04:03



1 Answer

I assign UUIDs when the records are created. If the record is inherently unique, I use type 4 UUIDs (random); when it isn't, I use type 5 (SHA-1 hash) with the natural keys as input.
Then you can follow the instructions from AWS to perform UPSERTs quite easily. If your input has duplicates, you should be able to clean up your staging table by issuing SQL that looks something like this:
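The UUID scheme above can be sketched in Python with the standard-library `uuid` module. The namespace and natural-key string below are hypothetical placeholders; any fixed namespace and any stable encoding of the natural key will do:

```python
import uuid

# Type 4: random UUID, for records that are inherently unique
random_id = uuid.uuid4()

# Type 5: deterministic SHA-1-based UUID derived from a natural key,
# so re-ingesting the same record always yields the same ID.
# NAMESPACE and the key string here are assumptions for illustration.
NAMESPACE = uuid.NAMESPACE_URL
natural_key = "customer:42|2013-03-02"
stable_id = uuid.uuid5(NAMESPACE, natural_key)

# Deriving again from the same inputs reproduces the same UUID,
# which is what makes duplicate detection possible downstream.
assert stable_id == uuid.uuid5(NAMESPACE, natural_key)
```

Because type 5 IDs are deterministic, a duplicate source record maps to the same primary key value, so the staging-table cleanup below can catch it.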

CREATE TABLE cleaned AS
SELECT
  pk_field,
  field_1,
  field_2,
  ...
FROM (
  SELECT
    ROW_NUMBER() OVER (PARTITION BY pk_field ORDER BY pk_field) AS r,
    t.*
  FROM table1 t
) x
WHERE x.r = 1;
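The effect of the `ROW_NUMBER() ... WHERE x.r = 1` filter is to keep one arbitrary row per key. A minimal Python sketch of the same keep-first-per-key semantics (the field names are placeholders matching the SQL above):

```python
def dedupe_keep_first(rows, key="pk_field"):
    # Mirrors ROW_NUMBER() OVER (PARTITION BY key) ... WHERE r = 1:
    # the first row seen for each key value is kept, later ones dropped.
    seen = {}
    for row in rows:
        seen.setdefault(row[key], row)
    return list(seen.values())

rows = [
    {"pk_field": 1, "field_1": "a"},
    {"pk_field": 1, "field_1": "b"},  # duplicate key, dropped
    {"pk_field": 2, "field_1": "c"},
]
cleaned = dedupe_keep_first(rows)
```

Note that because the SQL orders each partition by the partition key itself, which row survives among duplicates is effectively arbitrary; order by a tiebreaker column (e.g. a load timestamp) if you care which copy wins.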
Enno Shioji answered Oct 09 '22 20:10