I'm trying to use the COPY command to insert data from a file into PGSQL via Python. This works incredibly well when the target table is empty or I ensure ahead of time there will be no unique key collisions:
cmd = ("COPY %s (%s) FROM STDIN WITH (FORMAT CSV, NULL '_|NULL|_')" %
(tableName, colStr))
cursor.copy_expert(cmd, io)
I'd prefer however to be able to perform this COPY command without first emptying the table. Is there any way to do an 'INSERT or UPDATE' type operation with SQL COPY?
COPY moves data between PostgreSQL tables and standard file-system files. COPY TO copies the contents of a table to a file, while COPY FROM copies data from a file to a table (appending the data to whatever is in the table already). COPY TO can also copy the results of a SELECT query.
The COPY FROM command is usually much faster than issuing individual INSERT statements, because all rows are loaded by a single command rather than being parsed, planned, and sent to the server one statement at a time.
If you COPY data into a table already containing data, the new data will be appended. If you COPY TO a file already containing data, the existing data will be overwritten.
Define the UPDATE statement that changes the data in the PostgreSQL table. Execute it using cursor.execute(). Close the cursor and the database connection. A typical example updates a single row of the table. Verify the result of the update by selecting the data back from the PostgreSQL table using Python.
A useful technique within PostgreSQL is to use the COPY command to insert values directly into tables from external files. Files used for input by COPY must either be in standard ASCII text format, whose fields are delimited by a uniform symbol, or in PostgreSQL’s binary table format. Common delimiters for ASCII files are tabs and commas.
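As a minimal sketch of preparing such input from Python, the rows can be serialized to an in-memory CSV file before being handed to copy_expert. The helper name rows_to_csv_buffer and the sample rows are assumptions made for illustration; the NULL marker is the one used in the question's COPY command.

```python
import csv
import io

# Same NULL marker as in the question's COPY command
NULL_MARKER = "_|NULL|_"


def rows_to_csv_buffer(rows):
    """Serialize rows to an in-memory CSV file, writing NULL_MARKER for None.

    Hypothetical helper: the name and signature are illustrative only.
    """
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    for row in rows:
        # The marker contains no comma or quote, so csv leaves it unquoted,
        # which COPY needs in order to treat it as NULL in CSV format.
        writer.writerow([NULL_MARKER if value is None else value for value in row])
    buf.seek(0)  # rewind so COPY reads from the start of the buffer
    return buf


buf = rows_to_csv_buffer([(1, "alice", None), (2, "bob, jr.", "x")])
print(buf.getvalue())
```

The returned buffer can be passed as the file argument of cursor.copy_expert(cmd, buf) in place of a file on disk.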
First, install psycopg2 using pip. Second, establish a PostgreSQL database connection in Python. Next, define the INSERT query; all you need to know is the table's column details. Execute the INSERT query using cursor.execute(). Afterwards, cursor.rowcount holds the number of rows affected.
Not directly through the copy command.
What you can do however is create a temporary table, populate that table with the copy command, and then do your insert and update from that.
-- Clone table structure of target table
create temporary table __copy as (select * from my_schema.my_table limit 0);
-- Copy command goes here...
-- Update existing records
update
my_schema.my_table
set
column_2 = __copy.column_2
from
__copy
where
my_table.column_1 = __copy.column_1;
-- Insert new records
insert into my_schema.my_table (
column_1,
column_2
) (
select
__copy.column_1,
__copy.column_2 -- qualified, because column_2 exists in both joined tables
from
__copy
left join my_schema.my_table using(column_1)
where
my_table is null
);
You might consider creating an index on __copy after populating it with data to speed the update query up.
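For what it's worth, here is a rough sketch of how the whole sequence could be driven from Python. It assumes a psycopg2-style cursor (only execute and copy_expert are used) and the two-column table from the SQL above; the function name upsert_via_staging is made up for illustration, and the select list is qualified with __copy to avoid an ambiguous column_2.

```python
# Table and column names follow the SQL in this answer; adjust to your schema.
COPY_SQL = (
    "COPY __copy (column_1, column_2) "
    "FROM STDIN WITH (FORMAT CSV, NULL '_|NULL|_')"
)

UPDATE_SQL = """
    UPDATE my_schema.my_table
    SET column_2 = __copy.column_2
    FROM __copy
    WHERE my_table.column_1 = __copy.column_1
"""

INSERT_SQL = """
    INSERT INTO my_schema.my_table (column_1, column_2)
    SELECT __copy.column_1, __copy.column_2
    FROM __copy
    LEFT JOIN my_schema.my_table USING (column_1)
    WHERE my_table.column_1 IS NULL
"""


def upsert_via_staging(cursor, csv_file):
    """Load csv_file into a temp table, then update and insert from it.

    Hypothetical helper: `cursor` is assumed to behave like a psycopg2 cursor.
    """
    # Empty clone of the target table, visible only to this session
    cursor.execute(
        "CREATE TEMPORARY TABLE __copy AS "
        "SELECT * FROM my_schema.my_table LIMIT 0"
    )
    cursor.copy_expert(COPY_SQL, csv_file)
    # Index the join column once the data is loaded
    cursor.execute("CREATE INDEX ON __copy (column_1)")
    cursor.execute(UPDATE_SQL)
    cursor.execute(INSERT_SQL)
    # Drop the staging table so the function can be called again
    cursor.execute("DROP TABLE __copy")
```

The caller still has to commit, for example with conn.commit() on the connection the cursor came from (conn being whatever your connection object is called). Running all steps in one transaction means other sessions never see a half-applied load.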
Consider using a temp table as a staging table that receives the CSV file data. Then run an append into the final table using Postgres' INSERT ... ON CONFLICT (colname) DO UPDATE ..., available in version 9.5+. See docs. Do note that the special excluded table is used to reference values originally proposed for insertion.
Also, assuming you use psycopg2, consider using sql.Identifier() to safely bind identifiers like table or column names. However, you would need to decompose colStr to wrap individual items:
from psycopg2 import sql
...
cursor.execute("DELETE FROM tempTable")
conn.commit()
cmd = sql.SQL("COPY {0} ({1}) FROM STDIN WITH (FORMAT CSV, NULL '_|NULL|_')")\
.format(sql.Identifier(temptableName),
sql.SQL(', ').join([sql.Identifier('col1'),
sql.Identifier('col2'),
sql.Identifier('col3')]))
cursor.copy_expert(cmd, io)
# Named upsert_sql so the string does not shadow the imported sql module
upsert_sql = "INSERT INTO finalTable (id_column, Col1, Col2, Col3)" + \
" SELECT id_column, Col1, Col2, Col3 FROM tempTable t" + \
" ON CONFLICT (id_column) DO UPDATE SET Col1 = EXCLUDED.Col1," + \
" Col2 = EXCLUDED.Col2," + \
" Col3 = EXCLUDED.Col3 ...;"
cursor.execute(upsert_sql)
conn.commit()
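If the column list varies, the upsert statement can be generated rather than typed out. The following is only a sketch: build_upsert and quote_ident are hypothetical helpers, and quote_ident is a plain-Python stand-in for sql.Identifier() so the example runs without a database connection. In real psycopg2 code, sql.Identifier() is the safer choice.

```python
def quote_ident(name):
    """Double-quote one identifier, doubling any embedded quotes.

    Stand-in for psycopg2's sql.Identifier(); it does not handle
    schema-qualified names such as my_schema.my_table.
    """
    return '"' + name.replace('"', '""') + '"'


def build_upsert(final_table, staging_table, key_column, columns):
    """Build INSERT ... SELECT ... ON CONFLICT (key) DO UPDATE for the columns."""
    # Key column first, then every other column exactly once
    ordered = [key_column] + [c for c in columns if c != key_column]
    col_list = ", ".join(quote_ident(c) for c in ordered)
    updates = ", ".join(
        "{0} = EXCLUDED.{0}".format(quote_ident(c)) for c in ordered[1:]
    )
    # With no non-key columns there is nothing to update
    action = "DO UPDATE SET " + updates if updates else "DO NOTHING"
    return (
        "INSERT INTO {final} ({cols}) SELECT {cols} FROM {staging} "
        "ON CONFLICT ({key}) {action}"
    ).format(
        final=quote_ident(final_table),
        cols=col_list,
        staging=quote_ident(staging_table),
        key=quote_ident(key_column),
        action=action,
    )


# Decompose a comma-separated column string like the question's colStr
col_str = "id_column, col1, col2"
columns = [c.strip() for c in col_str.split(",")]
print(build_upsert("final_table", "temp_table", "id_column", columns))
```

Two things to keep in mind: quoted identifiers are case-sensitive, so the names must match the table definition exactly, and ON CONFLICT DO UPDATE raises an error if the staging table contains the same key more than once, so deduplicate the staged rows first.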