 

How to bulk insert in Postgres ignoring all errors that may occur in the process?

Tags:

postgresql

I have to insert a large number of log records into a table every hour, and I don't care about integrity errors or violations that happen in the process.

If I disable autocommit and do a bulk insert, the cursor won't insert anything beyond the row where the transaction failed. Is there a way around this?

One hack is to handle this at the application level. I could implement an n-sized buffer and do bulk inserts. If something fails in that transaction, recursively repeat the insert for buffer_first_half + buffer_second_half:

def insert(buffer):
    """Bulk-insert `buffer`; on failure, bisect and retry each half."""
    if not buffer:
        return
    try:
        bulk_insert(buffer)
    except Exception:
        connection.rollback()
        if len(buffer) == 1:
            return  # a single offending row can't be split further; drop it
        # Floor division: len(buffer) / 2 would be a float in Python 3.
        marker = len(buffer) // 2
        insert(buffer[:marker])
        insert(buffer[marker:])

But I'm really hoping this can be achieved with something built into Postgres?

asked Nov 02 '22 by meson10


1 Answer

PostgreSQL doesn't provide anything built-in for this. You can use SAVEPOINTs, but they're not much better than individual transactions.
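
For reference, the SAVEPOINT variant looks roughly like this (a minimal sketch, not part of the original answer; psycopg2, the `log` table, and its columns are all assumptions):

import psycopg2

conn = psycopg2.connect("dbname=logs")  # hypothetical DSN
cur = conn.cursor()

for row in rows:  # `rows` assumed to be an iterable of (ts, msg) tuples
    cur.execute("SAVEPOINT sp")
    try:
        cur.execute("INSERT INTO log (ts, msg) VALUES (%s, %s)", row)
    except psycopg2.Error:
        # Roll back only this row; the rest of the batch survives.
        cur.execute("ROLLBACK TO SAVEPOINT sp")
    else:
        cur.execute("RELEASE SAVEPOINT sp")

conn.commit()  # one commit for all rows that made it

The overhead of a SAVEPOINT/RELEASE round-trip per row is why this isn't much cheaper than committing each row separately.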

Treat each insert as an individual transaction and work on making those transactions faster (see the sketch after this list):

  • SET synchronous_commit = off in your session
  • INSERT into an UNLOGGED table, then INSERT INTO ... SELECT the results into the real table after checking
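
The individual-transaction route might look like the following (again a rough sketch rather than the answer's own code; psycopg2 and the `log` table are assumptions):

import psycopg2

conn = psycopg2.connect("dbname=logs")  # hypothetical DSN
conn.autocommit = True                  # one transaction per INSERT
cur = conn.cursor()

# Don't wait for the WAL flush on each commit; a crash may lose the
# last few commits, which is usually acceptable for log data.
cur.execute("SET synchronous_commit = off")

for row in rows:  # `rows` assumed to be an iterable of (ts, msg) tuples
    try:
        cur.execute("INSERT INTO log (ts, msg) VALUES (%s, %s)", row)
    except psycopg2.Error:
        pass  # a failed statement aborts only its own transaction

# The UNLOGGED staging variant, in outline:
#   CREATE UNLOGGED TABLE log_staging (LIKE log);
#   -- bulk load into log_staging, then:
#   INSERT INTO log SELECT * FROM log_staging WHERE <your validity check>;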

Here's an earlier, closely related answer which also links to more information. I haven't marked this as a duplicate because the other one is specific to upsert-like data loading, while you're interested in more general error handling.

answered Nov 09 '22 by Craig Ringer