Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

psycopg2: insert multiple rows with one query

People also ask

How do I add multiple rows in PostgreSQL?

PostgreSQL INSERT Multiple Rows First, specify the name of the table that you want to insert data after the INSERT INTO keywords. Second, list the required columns or all columns of the table in parentheses that follow the table name. Third, supply a comma-separated list of rows after the VALUES keyword.

How do you insert multiple rows in Python?

What if you want to insert multiple rows into a table in a single insert query from the Python application. Use the cursor's executemany() function to insert multiple records into a table. Syntax of the executemany() method.


I built a program that inserts multiple lines to a server that was located in another city.

I found out that using this method was about 10 times faster than executemany. In my case tup is a tuple containing about 2000 rows. It took about 10 seconds when using this method:

args_str = ','.join(cur.mogrify("(%s,%s,%s,%s,%s,%s,%s,%s,%s)", x) for x in tup)
cur.execute("INSERT INTO table VALUES " + args_str) 

and 2 minutes when using this method:

cur.executemany("INSERT INTO table VALUES(%s,%s,%s,%s,%s,%s,%s,%s,%s)", tup)

New execute_values method in Psycopg 2.7:

data = [(1,'x'), (2,'y')]
insert_query = 'insert into t (a, b) values %s'
psycopg2.extras.execute_values (
    cursor, insert_query, data, template=None, page_size=100
)

The pythonic way of doing it in Psycopg 2.6:

data = [(1,'x'), (2,'y')]
records_list_template = ','.join(['%s'] * len(data))
insert_query = 'insert into t (a, b) values {}'.format(records_list_template)
cursor.execute(insert_query, data)

Explanation: If the data to be inserted is given as a list of tuples like in

data = [(1,'x'), (2,'y')]

then it is already in the exact required format as

  1. the values syntax of the insert clause expects a list of records as in

    insert into t (a, b) values (1, 'x'),(2, 'y')

  2. Psycopg adapts a Python tuple to a Postgresql record.

The only necessary work is to provide a records list template to be filled by psycopg

# We use the data list to be sure of the template length
records_list_template = ','.join(['%s'] * len(data))

and place it in the insert query

insert_query = 'insert into t (a, b) values {}'.format(records_list_template)

Printing the insert_query outputs

insert into t (a, b) values %s,%s

Now to the usual Psycopg arguments substitution

cursor.execute(insert_query, data)

Or just testing what will be sent to the server

print (cursor.mogrify(insert_query, data).decode('utf8'))

Output:

insert into t (a, b) values (1, 'x'),(2, 'y')

Update with psycopg2 2.7:

The classic executemany() is about 60 times slower than @ant32 's implementation (called "folded") as explained in this thread: https://www.postgresql.org/message-id/20170130215151.GA7081%40deb76.aryehleib.com

This implementation was added to psycopg2 in version 2.7 and is called execute_values():

from psycopg2.extras import execute_values
execute_values(cur,
    "INSERT INTO test (id, v1, v2) VALUES %s",
    [(1, 2, 3), (4, 5, 6), (7, 8, 9)])

Previous Answer:

To insert multiple rows, using the multirow VALUES syntax with execute() is about 10x faster than using psycopg2 executemany(). Indeed, executemany() just runs many individual INSERT statements.

@ant32 's code works perfectly in Python 2. But in Python 3, cursor.mogrify() returns bytes, cursor.execute() takes either bytes or strings, and ','.join() expects str instance.

So in Python 3 you may need to modify @ant32 's code, by adding .decode('utf-8'):

args_str = ','.join(cur.mogrify("(%s,%s,%s,%s,%s,%s,%s,%s,%s)", x).decode('utf-8') for x in tup)
cur.execute("INSERT INTO table VALUES " + args_str)

Or by using bytes (with b'' or b"") only:

args_bytes = b','.join(cur.mogrify("(%s,%s,%s,%s,%s,%s,%s,%s,%s)", x) for x in tup)
cur.execute(b"INSERT INTO table VALUES " + args_bytes) 

A snippet from Psycopg2's tutorial page at Postgresql.org (see bottom):

A last item I would like to show you is how to insert multiple rows using a dictionary. If you had the following:

namedict = ({"first_name":"Joshua", "last_name":"Drake"},
            {"first_name":"Steven", "last_name":"Foo"},
            {"first_name":"David", "last_name":"Bar"})

You could easily insert all three rows within the dictionary by using:

cur = conn.cursor()
cur.executemany("""INSERT INTO bar(first_name,last_name) VALUES (%(first_name)s, %(last_name)s)""", namedict)

It doesn't save much code, but it definitively looks better.