Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why is table-swapping in Postgres so verbose?

I'd like to backfill a column of a large (20M rows), frequently-read but rarely-written table. From various articles and questions on SO, it seems like the best way to do this is create a table with identical structure, load in the backfilled data, and live-swap (since renaming is pretty quick). Sounds good!

But when I actually write the script to do this, it is mind-blowingly long. Here's a taste:

BEGIN;
  CREATE TABLE foo_new (LIKE foo);
  -- I don't use INCLUDING ALL, because that produces Indexes/Constraints with different names

  -- This is the only part of the script that is specific to my case.
  -- Everything else is standard for any table swap
  INSERT INTO foo_new (id, first_name, last_name, email, full_name)
    (SELECT id, first_name, last_name, email, first_name || last_name) FROM foo);

  CREATE SEQUENCE foo_new_id_seq
    START 1
    INCREMENT BY 1
    NO MINVALUE
    NO MAXVALUE
    CACHE 1;
  SELECT setval('foo_new_id_seq', COALESCE((SELECT MAX(id)+1 FROM foo_new), 1), false);
  ALTER SEQUENCE foo_new_id_seq OWNED BY foo_new.id;
  ALTER TABLE ONLY foo_new ALTER COLUMN id SET DEFAULT nextval('foo_new_id_seq'::regclass);
  ALTER TABLE foo_new
    ADD CONSTRAINT foo_new_pkey
    PRIMARY KEY (id);
COMMIT;

-- Indexes are made concurrently, otherwise they would block reads for
-- a long time. Concurrent index creation cannot occur within a transaction.
CREATE INDEX CONCURRENTLY foo_new_on_first_name ON foo_new USING btree (first_name);
CREATE INDEX CONCURRENTLY foo_new_on_last_name ON foo_new USING btree (last_name);
CREATE INDEX CONCURRENTLY foo_new_on_email ON foo_new USING btree (email);
-- One more line for each index

BEGIN;
  ALTER TABLE foo RENAME TO foo_old;
  ALTER TABLE foo_new RENAME TO foo;

  ALTER SEQUENCE foo_id_seq RENAME TO foo_old_id_seq;
  ALTER SEQUENCE foo_new_id_seq RENAME TO foo_id_seq;

  ALTER TABLE foo_old RENAME CONSTRAINT foo_pkey TO foo_old_pkey;
  ALTER TABLE foo RENAME CONSTRAINT foo_new_pkey TO foo_pkey;

  ALTER INDEX foo_on_first_name RENAME TO foo_old_on_first_name;
  ALTER INDEX foo_on_last_name RENAME TO foo_old_on_last_name;
  ALTER INDEX foo_on_email RENAME TO foo_old_on_email;
  -- One more line for each index

  ALTER INDEX foo_new_on_first_name RENAME TO foo_on_first_name;
  ALTER INDEX foo_new_on_last_name RENAME TO foo_on_last_name;
  ALTER INDEX foo_new_on_email RENAME TO foo_on_email;
  -- One more line for each index
COMMIT;

-- TODO: drop old table (CASCADE)

And this doesn't even include foreign keys, or other constraints! Since the only part of this that is specific to my case in the INSERT INTO bit, I'm surprised that there's no built-in Postgres function to do this sort of swapping. Is this operation less common than I make it out to be? Am I underestimating the variety of ways this can be accomplished? Is my desire to keep naming consistent an atypical one?

like image 376
Simon Lepkin Avatar asked Nov 23 '16 20:11

Simon Lepkin


Video Answer


1 Answers

It's probably not all that common. Most tables aren't big enough to warrant it, and most applications can tolerate some amount of downtime here and there.

More importantly, different applications can afford to cut corners in different ways depending on their workload. The database server can't; it needs to handle (or to very deliberately not handle) every possible obscure edge-case, which is likely a lot harder than you might expect. Ultimately, writing tailored solutions for different use cases probably makes more sense.

Anyway, if you're just trying to implement a calculated field as first_name || last_name, there are better ways of doing it:

ALTER TABLE foo RENAME TO foo_base;
CREATE VIEW foo AS
  SELECT
    id,
    first_name,
    last_name,
    email,
    (first_name || last_name) AS full_name
  FROM foo_base;

Assuming that your real case is more complicated, all of this effort may still be unnecessary. I believe that the copy-and-rename approach is largely based on the assumption that you need to lock the table against concurrent modifications for the duration of this process, and so the goal is to get it done as quickly as possible. If all concurrent operations are read-only - which appears to be the case, since you're not locking the table - then you're probably better off with a simple UPDATE (which won't block SELECTs), even if it does take a bit longer (though it does have the advantage of avoiding foreign key re-checks and TOAST table rewrites).

If this approach really is justified, I think there a few opportunities for improvement:

  • You don't need to recreate/reset the sequence; you can just link the existing sequence to the new table.
  • CREATE INDEX CONCURRENTLY seems unnecessary, as nobody else should be trying to access foo_new yet. In fact, if the whole script were in one transaction, it wouldn't even be externally visible at this point.
  • Table names only need to be unique within a schema. If you temporarily create a schema for the new table, you should be able to replace all of those RENAMEs with a single ALTER TABLE foo SET SCHEMA public.
  • Even if you don't expect concurrent writes, it wouldn't hurt to LOCK foo IN SHARE MODE anyway...

EDIT:

The sequence reassignment is a little more involved than I expected, as it seems that they need to stay in the same schema as their parent table. But here is (what appears to be) a working example:

BEGIN;
  LOCK public.foo IN SHARE MODE;
  CREATE SCHEMA tmp;
  CREATE TABLE tmp.foo (LIKE public.foo);

  INSERT INTO tmp.foo (id, first_name, last_name, email, full_name)
    SELECT id, first_name, last_name, email, (first_name || last_name) FROM public.foo;

  ALTER TABLE tmp.foo ADD CONSTRAINT foo_pkey PRIMARY KEY (id);
  CREATE INDEX foo_on_first_name ON tmp.foo (first_name);
  CREATE INDEX foo_on_last_name ON tmp.foo (last_name);
  CREATE INDEX foo_on_email ON tmp.foo (email); 
  ALTER TABLE tmp.foo ALTER COLUMN id SET DEFAULT nextval('public.foo_id_seq'); 

  ALTER SEQUENCE public.foo_id_seq OWNED BY NONE;
  DROP TABLE public.foo;

  ALTER TABLE tmp.foo SET SCHEMA public;
  ALTER SEQUENCE public.foo_id_seq OWNED BY public.foo.id;
  DROP SCHEMA tmp;
COMMIT;
like image 171
Nick Barnes Avatar answered Sep 29 '22 07:09

Nick Barnes