Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Can I change the distribution method on an existing Citus table?

Tags:

During a migration from MySQL into a Citus cluster, I used the range distribution method. The migration is complete, but now I'd like to change the distribution method to hash.

Is there a way to change the distribution method from range to hash for an existing table with data already in it?

I came up with the following procedure, but am not sure it's valid:

  1. Update the minvalue and maxvalue columns of the pg_dist_shard table for all shards being changed
  2. Update the shard storage type column of the pg_dist_partition table from r to h
  3. COMMIT;
like image 618
Eugen Konkov Avatar asked Apr 28 '16 07:04

Eugen Konkov


1 Answers

That is a good question. Currently, Citus does not provide a direct way to change partition type of existing data.

In range partitioning, records are placed in shards according to their partition column value and shard min/max values. If a record x resides in shard y, then it means y.minvalue <= x.partition_column <= y.maxvalue.

In hash partitioning, the partition column is hashed and records are routed according to this hashed value. Therefore, min/max values you see in pg_dist_shard are the boundary values for the result of the hash function. In this case y.minvalue <= hash(x.partition_column) <= y.maxvalue.

Therefore, doing the changes you have mentioned would end up with an incorrect distribution. In order to switch from range partition to hash partition, the data should be re-distributed. To do that, I suggest reloading the data to an empty hash-partitioned table.

For more information, you can refer to Working with Distributed Tables and Hash Distribution sections of Citus Documentation.

like image 78
Ahmet Eren Başak Avatar answered Sep 28 '22 04:09

Ahmet Eren Başak