Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

SQL Import skip duplicates

I am trying to do a bulk upload into a SQL server DB. The source file has duplicates which I want to remove, so I was hoping that the operation would automatically upload the first one, then discard the rest. (I've set a unique key constraint). Problem is, the moment a duplicate upload is attempted the whole thing fails and gets rolled back. Is there any way I can just tell SQL to keep going?

like image 280
Karl Avatar asked Jul 02 '26 09:07

Karl


2 Answers

Try to bulk insert the data to the temporary table and then SELECT DISTINCT as @madcolor suggested or

INSERT INTO yourTable
SELECT * FROM #tempTable tt
WHERE NOT EXISTS (SELECT 1 FROM youTable yt WHERE yt.id = tt.id)

or other field in WHERE clause.

like image 154
Lukasz Lysik Avatar answered Jul 04 '26 22:07

Lukasz Lysik


If you're doing this through some SQL tool like SQL Plus or DBVis or Toad, then I suspect not. If you're doing this programatically in a language, then you need to divide and conquer. Presumably executing an update line by line and catching each exception would be too lengthy a process, so instead you could do a batch operation first on the whole SQL block, and if it fails, do it on the first half, and if that fails, do it on the first half of the first half. Iterate this way until you have a block that succeeds. Discard the block and do the same procedure on the rest of the SQL. Anything that violates a constraint will eventually end up as a sole SQL statement which you know to log and discard. This should import with as much bulk processing as is possible while still throwing out the invalid lines.

like image 26
Jherico Avatar answered Jul 04 '26 23:07

Jherico



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!