 

Inserting only unique rows into SQLite (python)

Tags: python, sqlite

I'm using cursor.executemany to insert batches of rows from CSV files into a SQLite table. Some of the rows are expected to be duplicates based on the primary key field. When I execute the command, I predictably get an IntegrityError and nothing gets inserted.

How do I selectively insert only the non-duplicate rows without having to manually filter them out ahead of time? I know that in pure Python you could simply catch the exception and skip the duplicate row. Is there something similar I can implement in this use case?

ChrisArmstrong asked Dec 01 '12 19:12




2 Answers

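Simply use INSERT OR IGNORE instead of INSERT. Rows that would violate the primary key constraint are skipped without raising an error, and the rest of the batch is inserted.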

http://sqlite.org/lang_insert.html
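As a minimal sketch of how this fits with executemany, assuming a hypothetical table named items with columns id and name (neither is from the question) and an in-memory database for illustration:

```python
import sqlite3


def insert_unique(conn, rows):
    """Insert rows with INSERT OR IGNORE and return how many were actually added."""
    before = conn.total_changes
    # OR IGNORE makes SQLite skip any row that violates a uniqueness
    # constraint instead of aborting the whole statement.
    conn.executemany("INSERT OR IGNORE INTO items (id, name) VALUES (?, ?)", rows)
    conn.commit()
    return conn.total_changes - before


# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO items VALUES (1, 'existing')")

batch = [(1, "dup"), (2, "b"), (2, "dup again"), (3, "c")]
inserted = insert_unique(conn, batch)
print(inserted)
print(conn.execute("SELECT id, name FROM items ORDER BY id").fetchall())
```

Note that the first row seen for a given key wins; later duplicates in the same batch are dropped silently, so this approach suits cases where you do not need to know which rows were skipped.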

schlenk answered Sep 23 '22 02:09


One option is simply writing out the loop manually with an error catch instead of using executemany.

Pseudocode:

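import sqlite3

for row in csvfile:
    try:
        # Bind the value with a ? placeholder rather than string formatting
        cursor.execute('INSERT INTO X (Y) VALUES (?)', (row[rowdatapoint],))
    except sqlite3.IntegrityError:
        # Duplicate key: skip this row and continue with the next one
        pass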

This is probably not as efficient as executemany, but it catches the error without requiring more complicated SQL changes, such as pregenerating a giant INSERT SQL string.
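One thing the row-by-row loop offers that a bulk insert does not is a record of which rows were rejected. A runnable sketch of that idea follows; the items table, its columns, and the inline CSV text are assumptions made up for the example, not part of the question:

```python
import csv
import io
import sqlite3


def insert_rows_skipping_duplicates(conn, rows):
    """Insert rows one at a time, collecting any that hit a constraint violation."""
    inserted = 0
    skipped = []
    for row in rows:
        try:
            conn.execute(
                "INSERT INTO items (id, name) VALUES (?, ?)",
                (int(row[0]), row[1]),
            )
            inserted += 1
        except sqlite3.IntegrityError:
            # Only the failing statement is undone; earlier inserts are kept.
            skipped.append(row)
    conn.commit()
    return inserted, skipped


# Hypothetical CSV content standing in for a real file.
csv_text = "1,apple\n2,banana\n1,apricot\n3,cherry\n"
reader = csv.reader(io.StringIO(csv_text))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")

inserted, skipped = insert_rows_skipping_duplicates(conn, reader)
print(inserted, skipped)
```

Committing once after the loop, rather than per row, keeps all the inserts in a single transaction, which tends to recover much of the speed lost by not using executemany.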

jdotjdot answered Sep 19 '22 02:09