Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Dynamic table partitioning by string identifier in Postgres

I would like to partition a table in Postgres by previously unknown value. In my scenario that value would be device_id which is a string.

This is current situation:

Table 'device_data' - stores sensor data which is sent from devices, defined by DDL:

CREATE TABLE warehouse.device_data (
  id INTEGER PRIMARY KEY NOT NULL DEFAULT nextval('device_data_id_seq'::regclass),
  device_id TEXT NOT NULL,
  device_data BYTEA NOT NULL,
--   contains additional fields which are omitted for brevity
  received_at TIMESTAMP WITHOUT TIME ZONE DEFAULT now()
);

Table currently holds millions of records and queries are taking huge amount of time. Most of queries contain WHERE device_id='something' clause.

Solution I have in mind is to create table partitions for each device_id.

Is it possible in Postgres to create table partitions for each device_id?

I went through Postgres documentation and couple of examples I found but all of them use fixed boundaries to create partitions. My solution would require:

  1. create new table partition on the fly when new device_id is first encountered
  2. store to an existing partition if the device_id is already known and partition for that device_id already exist

I would like this to be done using table partitions as it would allow querying across multiple device_ids.

like image 901
safar Avatar asked Oct 18 '16 14:10

safar


1 Answers

I like the idea of dynamic partitioning. I do not know though how it will affect the performance as I have never used it.

Change the type of id to int default 0 and manually create the sequence to avoid multiple nextval() calls on a single insert:

create table device_data (
    id int primary key default 0,
    device_id text not null,
    device_data text not null, -- changed for tests
    received_at timestamp without time zone default now()
);
create sequence device_data_seq owned by device_data.id;

Use dynamic sql in the trigger function:

create or replace function before_insert_on_device_data()
returns trigger language plpgsql as $$
begin
    execute format(
        $f$
            create table if not exists %I (
            check (device_id = %L)
            ) inherits (device_data)
        $f$, 
        concat('device_data_', new.device_id), 
        new.device_id);
    execute format(
        $f$
            insert into %I
            values (nextval('device_data_seq'), %L, %L, default)
        $f$, 
        concat('device_data_', new.device_id), 
        new.device_id, 
        new.device_data);
    return null;
end $$;

create trigger before_insert_on_device_data
    before insert on device_data
    for each row execute procedure before_insert_on_device_data();

Test:

insert into device_data (device_id, device_data) values
    ('first', 'data 1'),
    ('second', 'data 1'),
    ('first', 'data 2'),
    ('second', 'data 2');

select * from device_data_first;

 id | device_id | device_data |        received_at         
----+-----------+-------------+----------------------------
  1 | first     | data 1      | 2016-10-18 19:50:40.179955
  3 | first     | data 2      | 2016-10-18 19:50:40.179955
(2 rows)

select * from device_data_second;

 id | device_id | device_data |        received_at         
----+-----------+-------------+----------------------------
  2 | second    | data 1      | 2016-10-18 19:50:40.179955
  4 | second    | data 2      | 2016-10-18 19:50:40.179955
(2 rows)
like image 127
klin Avatar answered Oct 15 '22 02:10

klin