When running this query in my server it's very slow, and I can't understand why. Can anyone help me figure it out?
Query:
SELECT
"t_dat"."t_year" AS "c0",
"t_dat"."t_month" AS "c1",
"t_dat"."t_week" AS "c2",
"t_dat"."t_day" AS "c3",
"t_purs"."p_id" AS "c4",
sum("t_purs"."days") AS "m0",
sum("t_purs"."timecreated") AS "m1"
FROM "t_dat", "t_purs"
WHERE "t_purs"."created" = "t_dat"."t_key"
AND "t_dat"."t_year" = 2013
AND "t_dat"."t_month" = 3
AND "t_dat"."t_week" = 9
AND "t_dat"."t_day" IN (1,2)
AND "t_purs"."p_id" IN (
'4','15','18','19','20','29',
'31','35','46','56','72','78')
GROUP BY
"t_dat"."t_year",
"t_dat"."t_month",
"t_dat"."t_week",
"t_dat"."t_day",
"t_purs"."p_id"
Explain Analyze:
HashAggregate (cost=12252.04..12252.04 rows=1 width=28) (actualtime=10212.374..10212.384 rows=10 loops=1) -> Nested Loop (cost=0.00..12252.03 rows=1 width=28) (actual time=3016.006..10212.249 rows=14 loops=1) Join Filter: (t_dat.t_key = t_purs.created) -> Seq Scan on t_dat (cost=0.00..129.90 rows=1 width=20) (actual time=0.745..2.040 rows=48 loops=1) Filter: ((t_day = ANY ('{1,2}'::integer[])) AND (t_year = 2013) AND (t_month = 3) AND (t_week = 9)) -> Seq Scan on t_purs (cost=0.00..12087.49 rows=9900 width=16) (actual time=0.018..201.630 rows=14014 loops=48) Filter: (p_id = ANY ('{4,15,18,19,20,29,31,35,46,56,72,78}'::integer[])) Total runtime: 10212.470 ms
It is difficult to say what exactly you are missing, but if I were you, I would make sure that following index exists:
CREATE INDEX t_dat_id_date_idx
ON t_dat (t_key, t_year, t_month, t_week, t_day);
For t_purs
, create this index:
CREATE INDEX t_purs_created_p_id_idx
ON t_purs (created, p_id);
Consider using a single column in your table:
t_date date
instead of (t_year, t_month, t_week, t_day)
. The data type date
occupies 4 byte. That would shrink your table a bit, make the index smaller and faster and grouping a lot easier.
Year, month, week and day can easily and quickly be extracted from a date with extract()
. Your query could then look like this and would be faster:
SELECT extract (year FROM t_date) AS c0
,extract (month FROM t_date) AS c1
,extract (week FROM t_date) AS c2
,extract (day FROM t_date) AS c3
,p.p_id AS c4
,sum(p.days) AS m0
,sum(p.timecreated) AS m1
FROM t_dat d
JOIN t_purs p ON p.created = d.t_key
WHERE d.t_date IN ('2013-03-01'::date, '2013-03-02'::date)
AND p.p_id IN (4,15,18,19,20,29,31,35,46,56,72,78)
GROUP BY d.t_date, p.p_id;
More important for performance is the index, which would then simply be:
CREATE INDEX t_dat_date_idx ON t_dat (t_key, t_date);
Or, depending on data distribution:
CREATE INDEX t_dat_date_idx ON t_dat (t_date, t_key);
The sequence of column matters. You may even create both.
Your query is having sequential scans on t_purs
and t_dat
. Creating the appropriate indexes will help you make this query faster and avoid sequential scans.
create index index_name on t_purs(created) where created is not null;
create index index_name on t_dat using btree(t_key, t_year, t_month, t_week, t_day)
Run explain analyze after running the above two queries. You'll see the plan time and execution time will be reduced.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With