Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to migrate an existing Postgres Table to partitioned table as transparently as possible?

I have an existing table in a postgres-DB. For the sake of demonstration, this is how it looks like:

create table myTable(     forDate date not null,     key2 int not null,     value int not null,     primary key (forDate, key2) );  insert into myTable (forDate, key2, value) values     ('2000-01-01', 1, 1),     ('2000-01-01', 2, 1),     ('2000-01-15', 1, 3),     ('2000-03-02', 1, 19),     ('2000-03-30', 15, 8),     ('2011-12-15', 1, 11); 

However in contrast to these few values, myTable is actually HUGE and it is growing continuously. I am generating various reports from this table, but currently 98% of my reports work with a single month and the remaining queries work with an even shorter timeframe. Oftentimes my queries cause Postgres to do table scans over this huge table and I am looking for ways to reduce the problem. Table partitioning seems to fit my problem perfectly. I could just partition my table into months. But how do I turn my existing table into a partitioned table? The manual explicitly states:

It is not possible to turn a regular table into a partitioned table or vice versa

So I need to develop my own migration script, which will analyze the current table and migrate it. The needs are as follows:

  • At design time the time frame which myTable covers is unknown.
  • Each partition should cover one month from the first day of that month to the last day of that month.
  • The table will grow indefinitely, so I have no sane "stop value" for how many tables to generate
  • The result should be as transparent as possible, meaning that I want to touch as little as possible of my existing code. In best case this feels like a normal table which I can insert to and select from without any specials.
  • A database downtime for migration is acceptable
  • Getting along with pure Postgres without any plugins or other things that need to be installed on the server is highly preferred.
  • Database is PostgreSQL 10, upgrading to a newer version will happen sooner or later anyway, so this is an option if it helps

How can I migrate my table to be partitioned?

like image 404
yankee Avatar asked Dec 03 '18 18:12

yankee


People also ask

Can you partition an existing table PostgreSQL?

PostgreSQL allows you to declare that a table is divided into partitions. The table that is divided is referred to as a partitioned table. The declaration includes the partitioning method as described above, plus a list of columns or expressions to be used as the partition key.

Can I partition an existing table?

The steps for partitioning an existing table are as follows: Create filegroups. Create a partition function. Create a partition scheme.

How do I add a partition to a table?

To add the partitioned index of a new data partition to a specific table space location separate from the table space location of the data partition, the partition level INDEX IN clause is added as an option on the ALTER TABLE ADD PARTITION statement.


1 Answers

In Postgres 10 "Declarative Partitioning" was introduced, which can relieve you of a good deal of work such as generating triggers or rules with huge if/else statements redirecting to the correct table. Postgres can do this automatically now. Let's start with the migration:

  1. Rename the old table and create a new partitioned table

    alter table myTable rename to myTable_old;  create table myTable_master(     forDate date not null,     key2 int not null,     value int not null ) partition by range (forDate); 

This should hardly require any explanation. The old table is renamed (after data migration we'll delete it) and we get a master table for our partition which is basically the same as our original table, but without indexes)

  1. Create a function that can generate new partitions as we need them:

    create function createPartitionIfNotExists(forDate date) returns void as $body$ declare monthStart date := date_trunc('month', forDate);     declare monthEndExclusive date := monthStart + interval '1 month';     -- We infer the name of the table from the date that it should contain     -- E.g. a date in June 2005 should be int the table mytable_200506:     declare tableName text := 'mytable_' || to_char(forDate, 'YYYYmm'); begin     -- Check if the table we need for the supplied date exists.     -- If it does not exist...:     if to_regclass(tableName) is null then         -- Generate a new table that acts as a partition for mytable:         execute format('create table %I partition of myTable_master for values from (%L) to (%L)', tableName, monthStart, monthEndExclusive);         -- Unfortunatelly Postgres forces us to define index for each table individually:         execute format('create unique index on %I (forDate, key2)', tableName);     end if; end; $body$ language plpgsql; 

This will come in handy later.

  1. Create a view that basically just delegates to our master table:

    create or replace view myTable as select * from myTable_master; 
  2. Create rule so that when we insert into the rule, we'll not just update out partitioned table, but also create a new partition if needed:

    create or replace rule autoCall_createPartitionIfNotExists as on insert     to myTable     do instead (         select createPartitionIfNotExists(NEW.forDate);         insert into myTable_master (forDate, key2, value) values (NEW.forDate, NEW.key2, NEW.value)     ); 

Of course, if you also need update and delete, you also need a rule for those which should be straight forward.

  1. Actually migrate the old table:

    -- Finally copy the data to our new partitioned table insert into myTable (forDate, key2, value) select * from myTable_old;  -- And get rid of the old table drop table myTable_old; 

Now migration of the table is complete without that there was any need to know how many partitions are needed and also the view myTable will be absolutely transparent. You can simple insert and select from that table as before, but you might get the performance benefit from partitioning.

Note that the view is only needed, because a partitioned table cannot have row triggers. If you can get along with calling createPartitionIfNotExists manually whenever needed from your code, you do not need the view and all it's rules. In this case you need to add the partitions als manually during migration:

do $$ declare rec record; begin     -- Loop through all months that exist so far...     for rec in select distinct date_trunc('month', forDate)::date yearmonth from myTable_old loop         -- ... and create a partition for them         perform createPartitionIfNotExists(rec.yearmonth);     end loop; end $$; 
like image 65
yankee Avatar answered Oct 23 '22 18:10

yankee