I'm using cursor.executemany to insert batches of rows from CSV files into a SQLite table, and some of the rows are expected to be duplicates based on the primary key field. When I execute the command, I predictably get an IntegrityError and nothing gets inserted.
How do I selectively insert only the non-duplicate rows without having to filter them out manually ahead of time? In pure Python I could simply catch the exception and skip the duplicate row; is there something similar I can do in this case?
Option 1: Have a unique constraint in your table. You can put the constraint you want directly in the table definition; this is the most natural way to express the requirement:

    CREATE TABLE Permission (
        permissionID INTEGER PRIMARY KEY,
        user INTEGER,
        location INTEGER,
        UNIQUE (user, location)
    );
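For illustration, here is a minimal sketch (the in-memory database and the sample values are my own additions) showing that a duplicate (user, location) pair is rejected with sqlite3.IntegrityError under this schema:

    import sqlite3

    conn = sqlite3.connect(":memory:")   # throwaway database just for the demo
    cur = conn.cursor()
    cur.execute("""
        CREATE TABLE Permission (
            permissionID INTEGER PRIMARY KEY,
            user INTEGER,
            location INTEGER,
            UNIQUE (user, location)
        )
    """)
    cur.execute("INSERT INTO Permission (user, location) VALUES (?, ?)", (1, 10))
    try:
        cur.execute("INSERT INTO Permission (user, location) VALUES (?, ?)", (1, 10))
    except sqlite3.IntegrityError as exc:
        print("duplicate rejected:", exc)   # UNIQUE constraint failed: Permission.user, Permission.location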
Inserting data using Python: import the sqlite3 package, then create a connection object with the connect() method, passing the name of the database as a parameter. The cursor() method returns a cursor object through which you can communicate with SQLite3.
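A minimal sketch of that setup (the database file name, the table X, and its columns are assumptions, not taken from the question):

    import sqlite3

    conn = sqlite3.connect("example.db")   # creates the file if it does not exist
    cur = conn.cursor()                    # cursor object for running SQL
    cur.execute("CREATE TABLE IF NOT EXISTS X (id INTEGER PRIMARY KEY, y TEXT)")
    cur.execute("INSERT INTO X (id, y) VALUES (?, ?)", (1, "a"))
    conn.commit()
    conn.close()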
INSERT OR IGNORE ... will insert the row(s) and ignore rows that violate any constraint (other than foreign key constraints).
Simply use INSERT OR IGNORE to ignore the duplicates: http://sqlite.org/lang_insert.html
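Applied to the executemany case from the question, a sketch might look like this (the CSV file name and the table/columns are assumptions):

    import csv
    import sqlite3

    conn = sqlite3.connect("example.db")
    cur = conn.cursor()
    cur.execute("CREATE TABLE IF NOT EXISTS X (id INTEGER PRIMARY KEY, y TEXT)")

    with open("data.csv", newline="") as f:   # hypothetical CSV file
        rows = list(csv.reader(f))

    # OR IGNORE tells SQLite to silently skip rows that violate the
    # primary key (or any other unique constraint) instead of aborting
    # the whole batch with IntegrityError.
    cur.executemany("INSERT OR IGNORE INTO X (id, y) VALUES (?, ?)", rows)
    conn.commit()
    conn.close()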
One option is simply writing out the loop manually with an error catch instead of using executemany.
Pseudocode:

    import sqlite3   # needed for sqlite3.IntegrityError

    for row in csvfile:   # csvfile: your iterable of parsed CSV rows
        try:
            # use a parameterised query rather than string formatting
            cursor.execute('INSERT INTO X (Y) VALUES (?)', (row[rowdatapoint],))
        except sqlite3.IntegrityError:
            pass   # duplicate primary key, skip this row
Probably not as efficient as executemany, but it catches the error without getting into more complicated SQL changes, which might otherwise involve pregenerating a giant INSERT SQL string.