Improve PostgreSQL query performance

Question

When running this query in my server it's very slow, and I can't understand why. Can anyone help me figure it out?
Query:

SELECT
    "t_dat"."t_year" AS "c0",
    "t_dat"."t_month" AS "c1",
    "t_dat"."t_week" AS "c2",
    "t_dat"."t_day" AS "c3",
    "t_purs"."p_id" AS "c4",
    sum("t_purs"."days") AS "m0",
    sum("t_purs"."timecreated") AS "m1"
FROM "t_dat", "t_purs"
WHERE "t_purs"."created" = "t_dat"."t_key"
  AND "t_dat"."t_year" = 2013
  AND "t_dat"."t_month" = 3
  AND "t_dat"."t_week" = 9
  AND "t_dat"."t_day" IN (1,2)
  AND "t_purs"."p_id" IN (
      '4','15','18','19','20','29',
      '31','35','46','56','72','78')
GROUP BY
    "t_dat"."t_year",
    "t_dat"."t_month",
    "t_dat"."t_week",
    "t_dat"."t_day",
    "t_purs"."p_id"

Explain Analyze:

HashAggregate  (cost=12252.04..12252.04 rows=1 width=28) (actualtime=10212.374..10212.384 rows=10 loops=1)
  ->  Nested Loop  (cost=0.00..12252.03 rows=1 width=28) (actual time=3016.006..10212.249 rows=14 loops=1)
        Join Filter: (t_dat.t_key = t_purs.created)
        ->  Seq Scan on t_dat  (cost=0.00..129.90 rows=1 width=20) (actual time=0.745..2.040 rows=48 loops=1)
              Filter: ((t_day = ANY ('{1,2}'::integer[])) AND (t_year = 2013) AND (t_month = 3) AND (t_week = 9))
        ->  Seq Scan on t_purs  (cost=0.00..12087.49 rows=9900 width=16) (actual time=0.018..201.630 rows=14014 loops=48)
              Filter: (p_id = ANY ('{4,15,18,19,20,29,31,35,46,56,72,78}'::integer[]))
Total runtime: 10212.470 ms

mvp · Accepted Answer

It is difficult to say what exactly you are missing, but if I were you, I would make sure that following index exists:

CREATE INDEX t_dat_id_date_idx
    ON t_dat (t_key, t_year, t_month, t_week, t_day);

For t_purs, create this index:

CREATE INDEX t_purs_created_p_id_idx
    ON t_purs (created, p_id);

Erwin Brandstetter · Answer

Consider using a single column in your table:

t_date date

instead of (t_year, t_month, t_week, t_day). The data type date occupies 4 byte. That would shrink your table a bit, make the index smaller and faster and grouping a lot easier.

Year, month, week and day can easily and quickly be extracted from a date with extract(). Your query could then look like this and would be faster:

SELECT extract (year  FROM t_date) AS c0
      ,extract (month FROM t_date) AS c1
      ,extract (week  FROM t_date) AS c2
      ,extract (day   FROM t_date) AS c3
      ,p.p_id                      AS c4
      ,sum(p.days)                 AS m0
      ,sum(p.timecreated)          AS m1
FROM   t_dat  d
JOIN   t_purs p ON p.created = d.t_key
WHERE  d.t_date IN ('2013-03-01'::date, '2013-03-02'::date)
AND    p.p_id IN (4,15,18,19,20,29,31,35,46,56,72,78)
GROUP  BY d.t_date, p.p_id;

More important for performance is the index, which would then simply be:

CREATE INDEX t_dat_date_idx ON t_dat (t_key, t_date);

Or, depending on data distribution:

CREATE INDEX t_dat_date_idx ON t_dat (t_date, t_key);

The sequence of column matters. You may even create both.

Pratyush Srivastava · Answer

enter image description here

Your query is having sequential scans on t_purs and t_dat. Creating the appropriate indexes will help you make this query faster and avoid sequential scans.

create index index_name on t_purs(created) where created is not null;
create index index_name on t_dat using btree(t_key, t_year, t_month, t_week, t_day)

Run explain analyze after running the above two queries. You'll see the plan time and execution time will be reduced.

Improve PostgreSQL query performance

Tags:

performance

sql

postgresql

database-design

Eli_Rozen

3 Answers

mvp

Erwin Brandstetter

Pratyush Srivastava

Recent Activity

Donate For Us

Improve PostgreSQL query performance

Tags:

performance

sql

postgresql

database-design

Eli_Rozen

3 Answers

mvp

Erwin Brandstetter

Pratyush Srivastava

Related questions

Recent Activity

Donate For Us