How to update selected rows with values from a CSV file in Postgres?

I'm using Postgres and would like to run a big UPDATE query that picks up its values from a CSV file. Let's say I've got a table with the columns (id, banana, apple).

I'd like to run an update that changes the bananas but not the apples. Each new banana and its ID would be in a CSV file.

I tried looking at the Postgres site but the examples are killing me.

user519753 asked Jan 18 '12 12:01



1 Answer

COPY the file to a temporary staging table and update the actual table from there. Like:

CREATE TEMP TABLE tmp_x (id int, apple text, banana text); -- but see below

COPY tmp_x FROM '/absolute/path/to/file' (FORMAT csv);

UPDATE tbl
SET    banana = tmp_x.banana
FROM   tmp_x
WHERE  tbl.id = tmp_x.id;

DROP TABLE tmp_x; -- else it is dropped at end of session automatically

If the imported table matches the table to be updated exactly, this may be convenient:

CREATE TEMP TABLE tmp_x AS SELECT * FROM tbl LIMIT 0; 

This creates an empty temporary table matching the structure of the existing table, without constraints.
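
For the case in the question, where the CSV carries only the ID and the new banana value, a minimal sketch might look like the following. The two-column staging table, the HEADER option (which assumes the file starts with a header line), and the IS DISTINCT FROM filter are assumptions added for illustration, not part of the answer above:

```sql
-- Assumes a CSV with a header line and exactly two columns: id,banana
CREATE TEMP TABLE tmp_x (id int, banana text);

-- HEADER tells COPY to skip the first line of the file
COPY tmp_x (id, banana) FROM '/absolute/path/to/file' (FORMAT csv, HEADER);

UPDATE tbl
SET    banana = tmp_x.banana
FROM   tmp_x
WHERE  tbl.id = tmp_x.id
AND    tbl.banana IS DISTINCT FROM tmp_x.banana;  -- skip rows that already hold the new value
```

The extra filter is optional. It only avoids rewriting rows whose banana would not change, which can matter when most of the CSV repeats existing values.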

Privileges

Up to Postgres 10, SQL COPY requires superuser privileges for this.
In Postgres 11 or later, there are also some predefined roles (formerly "default roles") to allow it. The manual:

COPY naming a file or command is only allowed to database superusers or users who are granted one of the roles pg_read_server_files, pg_write_server_files, or pg_execute_server_program [...]

The psql meta-command \copy works for any db role. The manual:

Performs a frontend (client) copy. This is an operation that runs an SQL COPY command, but instead of the server reading or writing the specified file, psql reads or writes the file and routes the data between the server and the local file system. This means that file accessibility and privileges are those of the local user, not the server, and no SQL superuser privileges are required.

The scope of temporary tables is limited to a single session of a single role, so the above has to be executed in the same psql session:

CREATE TEMP TABLE ...;
\copy tmp_x FROM '/absolute/path/to/file' (FORMAT csv);
UPDATE ...;

If you are scripting this in a bash command, be sure to wrap it all in a single psql call. Like:

echo '
CREATE TEMP TABLE tmp_x ...;
\copy tmp_x FROM ...;
UPDATE ...;
' | psql

Normally, the separator meta-command \\ lets you switch between psql meta-commands and SQL commands on the same line, but \copy is an exception: it consumes the rest of its line, so it has to end with a newline before the next SQL command. The manual again:

special parsing rules apply to the \copy meta-command. Unlike most other meta-commands, the entire remainder of the line is always taken to be the arguments of \copy, and neither variable interpolation nor backquote expansion are performed in the arguments.
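
One way to sidestep shell quoting and the line-based parsing of \copy is to put the statements in a script file and pass it to psql with -f, which runs the whole file in one session. A minimal sketch, where the file name update_bananas.sql, the two-column staging table, and the ON_ERROR_STOP setting are assumptions for illustration:

```sql
-- update_bananas.sql (hypothetical file name); run with: psql -f update_bananas.sql
-- Stop at the first error instead of continuing with a half-loaded staging table
\set ON_ERROR_STOP on

-- Assumes the CSV has exactly two columns: id, banana
CREATE TEMP TABLE tmp_x (id int, banana text);

-- \copy must stay on a single line; everything up to the newline is its argument
\copy tmp_x FROM '/absolute/path/to/file' (FORMAT csv)

UPDATE tbl
SET    banana = tmp_x.banana
FROM   tmp_x
WHERE  tbl.id = tmp_x.id;
```

Because \copy reads the file on the client side, the path here is resolved on the machine running psql, not on the database server.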

Big tables

If the imported table is big, it may pay to increase temp_buffers temporarily for the session (first thing in the session, before any temporary table is used):

SET temp_buffers = '500MB';  -- example value 

Add an index to the temporary table:

CREATE INDEX tmp_x_id_idx ON tmp_x(id); 

And run ANALYZE manually, since temporary tables are not covered by autovacuum / auto-analyze.

ANALYZE tmp_x; 
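
Put together, the steps for a large import could run in the order sketched below. The two-column staging table is an assumption for illustration, and 500MB remains just an example value:

```sql
-- Must come first: temp_buffers can only be changed before the session
-- first uses a temporary table
SET temp_buffers = '500MB';  -- example value

-- Assumes the CSV has exactly two columns: id, banana
CREATE TEMP TABLE tmp_x (id int, banana text);
COPY tmp_x FROM '/absolute/path/to/file' (FORMAT csv);

-- Build the index after loading, then refresh statistics for the planner
CREATE INDEX tmp_x_id_idx ON tmp_x (id);
ANALYZE tmp_x;

UPDATE tbl
SET    banana = tmp_x.banana
FROM   tmp_x
WHERE  tbl.id = tmp_x.id;
```

Creating the index after the COPY rather than before is a deliberate choice here: building it once over the loaded data is generally cheaper than maintaining it row by row during the load.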

Related answers:

  • Best way to delete millions of rows by ID
  • How can I insert common data into a temp table from disparate schemas?
  • How to delete duplicate entries?

Erwin Brandstetter answered Oct 08 '22 15:10