Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

DROP TABLE behaviour in Postgres transactions

Say I'm working on an ETL that periodically reads source data from a CSV and replaces a table in Postgres (9.6).

Here is one approach:

BEGIN
DROP TABLE IF EXISTS table
CREATE TABLE table (...)
COPY table FROM STDIN (FORMAT csv, HEADER true)
CREATE INDEX ON table (...)
COMMIT

Here is another:

BEGIN
CREATE TABLE table_temp (...)
COPY table_temp FROM STDIN (FORMAT csv, HEADER true)
CREATE INDEX ON table_temp (...)
DROP TABLE IF EXISTS table
ALTER TABLE table_temp RENAME TO table
COMMIT

Am I correct about the following?

  • The first transaction will lock the table at the DROP command, so concurrent SELECT queries will block until the transaction is complete.
  • The second transaction will not block SELECT queries until after the COPY and CREATE INDEX finish.
  • Both transactions are atomic: in either case, if I issue a ROLLBACK instead of COMMIT, the table will be restored with all original data and indexes.

Also, aside from the name of the index, are there any other functional differences?

like image 926
saltire Avatar asked Jun 06 '17 16:06

saltire


2 Answers

Some info for people who think that renaming a table will be sufficient for their use case. Beware! Renaming in PostgreSQL may not work as you expected sometimes.

More info here What could go wrong, when swapping table contents using table rename in Postgresql?

like image 71
Kivan Avatar answered Oct 02 '22 13:10

Kivan


Yes, all 3 of your assumptions are correct.

The end result would be the same in case of both transactions but the blocking caused by the first transaction would be longer and depend on the amount of data and the time it takes to create the index. The 2nd transaction would require a very brief lock to rename the objects.

like image 29
farrukh Avatar answered Oct 02 '22 12:10

farrukh