2013-05-29: Updated question with latest configuration and extra info. Earlier I was testing in a virtualbox image. Now I am testing on the productive server, which reflects the real world much better. Question should be fully clear now. If you have helped me before, read again carefully
Currently I have found a query that is very slow in PostgreSQL, though I do not understand how it can be slow. I scaled it down a bit so it's much smaller to post here (and much faster, but still slow!).
Little background: In this project I have adverts that belong to users. Users are part of an area within the country. An area can have multiple child areas, so the area table is a tree. A network is assigned to an area. When filtering on a network it should filter on that area and all its area childs in the tree. Because I can't query against a endless tree I have table that flattens this full tree.
So with 1 query (SELECT area_id FROM network_area_flatdeep WHERE network_id = 1)
I get all areas that belong to network 1:
63, 64, 65, 66, 67, 68, 69, 70
This makes querying very easy.
The slow query (Before testing everything has been VACUUM ANALYZED):
EXPLAIN ANALYZE SELECT a0_.id AS id0
FROM advert a0_
INNER JOIN member m6_
ON a0_.user_id = m6_.id
INNER JOIN area a7_
ON m6_.area_id = a7_.id
WHERE a0_.status IN ( 1 )
AND m6_.status IN ( 1 )
AND a7_.id IN (SELECT area_id FROM network_area_flatdeep WHERE network_id IN (1))
ORDER BY a0_.created_date DESC
LIMIT 60;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=41.10..34695.53 rows=60 width=12) (actual time=9.327..134.581 rows=60 loops=1)
-> Nested Loop (cost=41.10..2291276.69 rows=3967 width=12) (actual time=9.326..134.534 rows=60 loops=1)
Join Filter: (a7_.id = m6_.area_id)
Rows Removed by Join Filter: 22566
-> Nested Loop (cost=41.10..821630.16 rows=317633 width=24) (actual time=0.049..39.638 rows=22666 loops=1)
-> Index Scan Backward using advert_created_date_idx on advert a0_ (cost=0.00..762000.64 rows=317633 width=16) (actual time=0.013..4.357 rows=2834 loops=1)
Filter: (status = 1)
Rows Removed by Filter: 21
-> Materialize (cost=41.10..73.38 rows=15 width=8) (actual time=0.000..0.004 rows=8 loops=2834)
-> Nested Loop (cost=41.10..73.30 rows=15 width=8) (actual time=0.031..0.073 rows=8 loops=1)
-> HashAggregate (cost=41.10..41.18 rows=8 width=4) (actual time=0.023..0.026 rows=8 loops=1)
-> Bitmap Heap Scan on network_area_flatdeep (cost=4.37..41.06 rows=15 width=4) (actual time=0.011..0.015 rows=8 loops=1)
Recheck Cond: (network_id = 1)
-> Bitmap Index Scan on idx_c29e880034128b91 (cost=0.00..4.36 rows=15 width=0) (actual time=0.007..0.007 rows=8 loops=1)
Index Cond: (network_id = 1)
-> Index Only Scan using area_pkey on area a7_ (cost=0.00..4.01 rows=1 width=4) (actual time=0.002..0.003 rows=1 loops=8)
Index Cond: (id = network_area_flatdeep.area_id)
Heap Fetches: 8
-> Index Scan using member_pkey on member m6_ (cost=0.00..4.61 rows=1 width=8) (actual time=0.002..0.003 rows=1 loops=22666)
Index Cond: (id = a0_.user_id)
Filter: (status = 1)
Rows Removed by Filter: 0
Total runtime: 134.698 ms
(23 rows)
The subquery itself is:
EXPLAIN ANALYZE SELECT area_id FROM network_area_flatdeep WHERE network_id IN (1);
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on network_area_flatdeep (cost=4.37..41.06 rows=15 width=4) (actual time=0.020..0.024 rows=8 loops=1)
Recheck Cond: (network_id = 1)
-> Bitmap Index Scan on idx_c29e880034128b91 (cost=0.00..4.36 rows=15 width=0) (actual time=0.012..0.012 rows=8 loops=1)
Index Cond: (network_id = 1)
Total runtime: 0.051 ms
(5 rows)
which results in : 63, 64, 65, 66, 67, 68, 69, 70
So I've tried to hard code in the ids. And expected it to be faster (but is not):
EXPLAIN ANALYZE SELECT a0_.id AS id0
FROM advert a0_
INNER JOIN member m6_
ON a0_.user_id = m6_.id
INNER JOIN area a7_
ON m6_.area_id = a7_.id
WHERE a0_.status IN ( 1 )
AND m6_.status IN ( 1 )
AND a7_.id IN (63, 64, 65, 66, 67, 68, 69, 70)
ORDER BY a0_.created_date DESC
LIMIT 60;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=17558.82..17558.97 rows=60 width=12) (actual time=56.594..56.670 rows=60 loops=1)
-> Sort (cost=17558.82..17560.07 rows=498 width=12) (actual time=56.593..56.621 rows=60 loops=1)
Sort Key: a0_.created_date
Sort Method: top-N heapsort Memory: 27kB
-> Nested Loop (cost=0.00..17541.62 rows=498 width=12) (actual time=0.047..53.808 rows=4478 loops=1)
-> Nested Loop (cost=0.00..3903.99 rows=286 width=4) (actual time=0.027..17.492 rows=8004 loops=1)
-> Seq Scan on area a7_ (cost=0.00..144.78 rows=8 width=4) (actual time=0.007..0.823 rows=8 loops=1)
Filter: (id = ANY ('{63,64,65,66,67,68,69,70}'::integer[]))
Rows Removed by Filter: 5081
-> Index Scan using idx_70e4fa78bd0f409c on member m6_ (cost=0.00..468.38 rows=152 width=8) (actual time=0.011..1.208 rows=1000 loops=8)
Index Cond: (area_id = a7_.id)
Filter: (status = 1)
Rows Removed by Filter: 2
-> Index Scan using idx_54f1f40ba76ed395 on advert a0_ (cost=0.00..47.49 rows=19 width=16) (actual time=0.002..0.003 rows=1 loops=8004)
Index Cond: (user_id = m6_.id)
Filter: (status = 1)
Rows Removed by Filter: 1
Total runtime: 56.744 ms
(18 rows)
Time: 57.995 ms
I tried to put the subquery in the INNER JOIN:
EXPLAIN ANALYZE SELECT a0_.id AS id0
FROM advert a0_
INNER JOIN member m6_
ON a0_.user_id = m6_.id
INNER JOIN area a7_
ON m6_.area_id = a7_.id
AND m6_.area_id IN (SELECT area_id FROM network_area_flatdeep WHERE network_id IN (1))
WHERE a0_.status IN ( 1 )
AND m6_.status IN ( 1 )
ORDER BY a0_.created_date DESC
LIMIT 60;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=4.37..34966.73 rows=60 width=12) (actual time=2.957..42.443 rows=60 loops=1)
-> Nested Loop (cost=4.37..2311599.10 rows=3967 width=12) (actual time=2.956..42.394 rows=60 loops=1)
-> Nested Loop Semi Join (cost=4.37..2302173.51 rows=3967 width=20) (actual time=2.949..42.099 rows=60 loops=1)
Join Filter: (m6_.area_id = network_area_flatdeep.area_id)
Rows Removed by Join Filter: 22333
-> Nested Loop (cost=0.00..2230853.09 rows=316797 width=16) (actual time=0.028..18.612 rows=2829 loops=1)
-> Index Scan Backward using advert_created_date_idx on advert a0_ (cost=0.00..762000.64 rows=317633 width=16) (actual time=0.012..3.802 rows=2834 loops=1)
Filter: (status = 1)
Rows Removed by Filter: 21
-> Index Scan using member_pkey on member m6_ (cost=0.00..4.61 rows=1 width=8) (actual time=0.003..0.003 rows=1 loops=2834)
Index Cond: (id = a0_.user_id)
Filter: (status = 1)
Rows Removed by Filter: 0
-> Materialize (cost=4.37..41.14 rows=15 width=4) (actual time=0.000..0.004 rows=8 loops=2829)
-> Bitmap Heap Scan on network_area_flatdeep (cost=4.37..41.06 rows=15 width=4) (actual time=0.009..0.015 rows=8 loops=1)
Recheck Cond: (network_id = 1)
-> Bitmap Index Scan on idx_c29e880034128b91 (cost=0.00..4.36 rows=15 width=0) (actual time=0.006..0.006 rows=8 loops=1)
Index Cond: (network_id = 1)
-> Index Only Scan using area_pkey on area a7_ (cost=0.00..2.37 rows=1 width=4) (actual time=0.002..0.003 rows=1 loops=60)
Index Cond: (id = m6_.area_id)
Heap Fetches: 60
Total runtime: 42.538 ms
(22 rows)
I tried to get rid of the subquery and make a proper JOIN statement to network_area_flatdeep
(I STRONGLY PREFER THIS VERSION):
EXPLAIN ANALYZE SELECT a0_.id AS id0
FROM advert a0_
INNER JOIN member m6_
ON a0_.user_id = m6_.id
INNER JOIN area a7_
ON m6_.area_id = a7_.id
INNER JOIN network_area_flatdeep n14_
ON a7_.id = n14_.area_id
AND ( n14_.network_id IN ( 1 ) )
WHERE a0_.status IN ( 1 )
AND m6_.status IN ( 1 )
ORDER BY a0_.created_date DESC
LIMIT 60;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=30031.18..30031.33 rows=60 width=12) (actual time=62.968..63.045 rows=60 loops=1)
-> Sort (cost=30031.18..30033.51 rows=934 width=12) (actual time=62.967..62.991 rows=60 loops=1)
Sort Key: a0_.created_date
Sort Method: top-N heapsort Memory: 27kB
-> Nested Loop (cost=0.00..29998.92 rows=934 width=12) (actual time=0.157..60.280 rows=4478 loops=1)
-> Nested Loop (cost=0.00..4401.66 rows=536 width=4) (actual time=0.029..20.488 rows=8004 loops=1)
-> Nested Loop (cost=0.00..120.69 rows=15 width=8) (actual time=0.015..0.084 rows=8 loops=1)
-> Index Scan using idx_c29e880034128b91 on network_area_flatdeep n14_ (cost=0.00..60.47 rows=15 width=4) (actual time=0.009..0.019 rows=8 loops=1)
Index Cond: (network_id = 1)
-> Index Only Scan using area_pkey on area a7_ (cost=0.00..4.01 rows=1 width=4) (actual time=0.004..0.005 rows=1 loops=8)
Index Cond: (id = n14_.area_id)
Heap Fetches: 8
-> Index Scan using idx_70e4fa78bd0f409c on member m6_ (cost=0.00..283.88 rows=152 width=8) (actual time=0.011..1.278 rows=1000 loops=8)
Index Cond: (area_id = a7_.id)
Filter: (status = 1)
Rows Removed by Filter: 2
-> Index Scan using idx_54f1f40ba76ed395 on advert a0_ (cost=0.00..47.57 rows=19 width=16) (actual time=0.003..0.003 rows=1 loops=8004)
Index Cond: (user_id = m6_.id)
Filter: (status = 1)
Rows Removed by Filter: 1
Total runtime: 63.125 ms
(21 rows)
Changing IN
to =
doesn't helper either.
I tried using EXISTS as suggested by wilderplasser with an answer below.
When I either remove the ORDER BY or the network criteria, the query is fast (2ms). But why can't they play nice together?
Now then... when I change the network_id from '1' to '10'. The query is extremely fast like I want. 10 is a root network that contains ALL areas. So apperently the LESS results the JOIN needs to filter out, the faster the query is.
The structure looks like this:
area (mapped to network 10 which contains 5089 areas in flat table and is FAST)
| |---- area 1 (network 1 which contains 8 areas in flat table and is SLOW)
| |--- more areas
|-- ALOT MORE areas
|-- etc
Changing network 1 to 10 gives: 3.265 ms
To summarize the difference:
network 1 has 8 areas in network_area_flatdeep
network 10 has 5089 areas in network_area_flatdeep
So INNER JOIN using network 1 is very slow, INNER JOIN using network 10 is very fast. Same behavior with subqueries.
Advert table
353804 rows
Table "public.advert"
Column | Type | Modifiers | Storage | Stats target | Description
-----------------------------+--------------------------------+-----------------------------------------------------+----------+--------------+-------------
id | integer | not null default nextval('advert_id_seq'::regclass) | plain | |
user_id | integer | not null | plain | |
advert_category_id | integer | not null | plain | |
currency_id | integer | not null | plain | |
advert_kind_id | integer | not null | plain | |
advert_price_id | integer | | plain | |
external_source_id | integer | | plain | |
status | integer | not null | plain | |
type | integer | not null | plain | |
title | character varying(60) | not null | extended | |
description | text | not null | extended | |
price | numeric(19,2) | default NULL::numeric | main | |
accepting_bids | boolean | not null | plain | |
promoted | boolean | not null | plain | |
edited_date | timestamp(0) without time zone | default NULL::timestamp without time zone | plain | |
created_date | timestamp(0) without time zone | not null | plain | |
archived_date | timestamp(0) without time zone | default NULL::timestamp without time zone | plain | |
views | integer | not null | plain | |
checked_date | timestamp(0) without time zone | default NULL::timestamp without time zone | plain | |
archived_by_cron | boolean | not null | plain | |
unarchived_by_cron | boolean | not null | plain | |
containting_forbidden_words | boolean | not null | plain | |
external_id | character varying(255) | default NULL::character varying | extended | |
new_product | boolean | not null | plain | |
Indexes:
"advert_pkey" PRIMARY KEY, btree (id)
"advert_external_idx_uq" UNIQUE, btree (external_id, external_source_id)
"advert_archived_date_idx" btree (archived_date)
"advert_checked_date_idx" btree (checked_date)
"advert_created_date_idx" btree (created_date)
"advert_edited_date_idx" btree (edited_date)
"advert_external_id_idx" btree (external_id)
"advert_price_idx" btree (price)
"advert_status_idx" btree (status)
"advert_type_idx" btree (type)
"advert_views_idx" btree (views)
"idx_54f1f40b38248176" btree (currency_id)
"idx_54f1f40b54b67d66" btree (advert_price_id)
"idx_54f1f40b9a2e6cff" btree (advert_kind_id)
"idx_54f1f40ba76ed395" btree (user_id)
"idx_54f1f40bb167b375" btree (external_source_id)
"idx_54f1f40bd4436821" btree (advert_category_id)
Foreign-key constraints:
"fk_54f1f40b38248176" FOREIGN KEY (currency_id) REFERENCES currency(id) ON DELETE RESTRICT
"fk_54f1f40b54b67d66" FOREIGN KEY (advert_price_id) REFERENCES advertprice(id) ON DELETE RESTRICT
"fk_54f1f40b9a2e6cff" FOREIGN KEY (advert_kind_id) REFERENCES advertkind(id) ON DELETE RESTRICT
"fk_54f1f40ba76ed395" FOREIGN KEY (user_id) REFERENCES member(id) ON DELETE CASCADE
"fk_54f1f40bb167b375" FOREIGN KEY (external_source_id) REFERENCES externalsource(id) ON DELETE RESTRICT
"fk_54f1f40bd4436821" FOREIGN KEY (advert_category_id) REFERENCES advertcategory(id) ON DELETE RESTRICT
Referenced by:
TABLE "advert_photo" CONSTRAINT "fk_1c939974d07eccb6" FOREIGN KEY (advert_id) REFERENCES advert(id) ON DELETE CASCADE
TABLE "banner" CONSTRAINT "fk_6f9db8e7d07eccb6" FOREIGN KEY (advert_id) REFERENCES advert(id) ON DELETE SET NULL
TABLE "advertbid" CONSTRAINT "fk_fccdba75d07eccb6" FOREIGN KEY (advert_id) REFERENCES advert(id) ON DELETE CASCADE
Has OIDs: no
Area table:
5089 rows
Table "public.area"
Column | Type | Modifiers | Storage | Stats target | Description
------------+---------+---------------------------------------------------+---------+--------------+-------------
id | integer | not null default nextval('area_id_seq'::regclass) | plain | |
network_id | integer | | plain | |
parent_id | integer | | plain | |
selectable | boolean | not null | plain | |
Indexes:
"area_pkey" PRIMARY KEY, btree (id)
"idx_d7943d6834128b91" btree (network_id)
"idx_d7943d68727aca70" btree (parent_id)
Foreign-key constraints:
"fk_d7943d6834128b91" FOREIGN KEY (network_id) REFERENCES network(id) ON DELETE RESTRICT
"fk_d7943d68727aca70" FOREIGN KEY (parent_id) REFERENCES area(id) ON DELETE CASCADE
Referenced by:
TABLE "network_area_flat" CONSTRAINT "fk_10aae5b2bd0f409c" FOREIGN KEY (area_id) REFERENCES area(id) ON DELETE CASCADE
TABLE "area_language" CONSTRAINT "fk_17d42f7dbd0f409c" FOREIGN KEY (area_id) REFERENCES area(id) ON DELETE CASCADE
TABLE "area_zip_code" CONSTRAINT "fk_62a3bf90bd0f409c" FOREIGN KEY (area_id) REFERENCES area(id) ON DELETE CASCADE
TABLE "member" CONSTRAINT "fk_70e4fa78bd0f409c" FOREIGN KEY (area_id) REFERENCES area(id) ON DELETE RESTRICT
TABLE "network_area_flatdeep" CONSTRAINT "fk_c29e8800bd0f409c" FOREIGN KEY (area_id) REFERENCES area(id) ON DELETE CASCADE
TABLE "area" CONSTRAINT "fk_d7943d68727aca70" FOREIGN KEY (parent_id) REFERENCES area(id) ON DELETE CASCADE
Has OIDs: no
Member table:
182450 rows
Table "public.member"
Column | Type | Modifiers | Storage | Stats target | Description
-----------------------+--------------------------------+-----------------------------------------------------+----------+--------------+-------------
id | integer | not null default nextval('member_id_seq'::regclass) | plain | |
language_id | integer | not null | plain | |
area_id | integer | not null | plain | |
company_id | integer | | plain | |
external_source_id | integer | | plain | |
email | character varying(255) | not null | extended | |
password | character varying(40) | not null | extended | |
status | integer | not null | plain | |
name | character varying(150) | not null | extended | |
zip_code | character varying(20) | | extended | |
phone_number | character varying(120) | default NULL::character varying | extended | |
using_email_service | boolean | not null | plain | |
edited_date | timestamp(0) without time zone | default NULL::timestamp without time zone | plain | |
created_date | timestamp(0) without time zone | not null | plain | |
hiding_on_own_network | boolean | not null | plain | |
staff | boolean | not null | plain | |
superuser | boolean | not null | plain | |
external_id | character varying(255) | default NULL::character varying | extended | |
last_login_date | timestamp(0) without time zone | default NULL::timestamp without time zone | plain | |
deleted_adverts | integer | not null | plain | |
Indexes:
"member_pkey" PRIMARY KEY, btree (id)
"user_email_idx_uq" UNIQUE, btree (email)
"user_external_idx_uq" UNIQUE, btree (external_id, external_source_id)
"idx_70e4fa7882f1baf4" btree (language_id)
"idx_70e4fa78979b1ad6" btree (company_id)
"idx_70e4fa78b167b375" btree (external_source_id)
"idx_70e4fa78bd0f409c" btree (area_id)
"user_external_id_idx" btree (external_id)
"user_name_idx" btree (name)
"user_status_idx" btree (status)
Foreign-key constraints:
"fk_70e4fa7882f1baf4" FOREIGN KEY (language_id) REFERENCES language(id) ON DELETE RESTRICT
"fk_70e4fa78979b1ad6" FOREIGN KEY (company_id) REFERENCES company(id) ON DELETE SET NULL
"fk_70e4fa78b167b375" FOREIGN KEY (external_source_id) REFERENCES externalsource(id) ON DELETE RESTRICT
"fk_70e4fa78bd0f409c" FOREIGN KEY (area_id) REFERENCES area(id) ON DELETE RESTRICT
Referenced by:
TABLE "user_link" CONSTRAINT "fk_4c2dd538a76ed395" FOREIGN KEY (user_id) REFERENCES member(id) ON DELETE CASCADE
TABLE "advert" CONSTRAINT "fk_54f1f40ba76ed395" FOREIGN KEY (user_id) REFERENCES member(id) ON DELETE CASCADE
TABLE "banner" CONSTRAINT "fk_6f9db8e7a76ed395" FOREIGN KEY (user_id) REFERENCES member(id) ON DELETE SET NULL
TABLE "user_admin_module_permission" CONSTRAINT "fk_74fee7cea76ed395" FOREIGN KEY (user_id) REFERENCES member(id) ON DELETE CASCADE
TABLE "user_admin_resource_permission" CONSTRAINT "fk_c9fcf279a76ed395" FOREIGN KEY (user_id) REFERENCES member(id) ON DELETE CASCADE
Has OIDs: no
Network_area_flatdeep table:
10177 rows
Table "public.network_area_flatdeep"
Column | Type | Modifiers | Storage | Stats target | Description
--------------+--------------------------------+--------------------------------------------------------------------+---------+--------------+-------------
id | integer | not null default nextval('network_area_flatdeep_id_seq'::regclass) | plain | |
network_id | integer | not null | plain | |
area_id | integer | not null | plain | |
created_date | timestamp(0) without time zone | not null | plain | |
Indexes:
"network_area_flatdeep_pkey" PRIMARY KEY, btree (id)
"area_flatdeep_idx_uq" UNIQUE, btree (area_id, network_id, created_date)
"idx_c29e880034128b91" btree (network_id)
"idx_c29e8800bd0f409c" btree (area_id)
Foreign-key constraints:
"fk_c29e880034128b91" FOREIGN KEY (network_id) REFERENCES network(id) ON DELETE CASCADE
"fk_c29e8800bd0f409c" FOREIGN KEY (area_id) REFERENCES area(id) ON DELETE CASCADE
Has OIDs: no
Short server config:
version
--------------------------------------------------------------------------------------------------------------
PostgreSQL 9.2.4 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-3), 64-bit
name | current_setting | source
----------------------------+--------------------+----------------------
shared_buffers | 1800MB | configuration file
work_mem | 4MB | configuration file
Conclusion:
The more data (less filtered out), the faster the query is. I'm a bit clueless now. How can PostgreSQL be so slow on this? Again: 40ms doesn't look extremely slow, but with the real query with a big SELECT statement, queries are 1.5s most of the time and can take up to 3s.
All filtering is done on indexes. The ordering is done on an index too.
Does anyone have an idea how to improve this simple filter I clearly need to separate data?
Some of the tricks we used to speed up SELECT-s in PostgreSQL: LEFT JOIN with redundant conditions, VALUES, extended statistics, primary key type conversion, CLUSTER, pg_hint_plan + bonus. Photo by Richard Jacobs on Unsplash.
An implementation of join in which one of the collections of rows to be joined is hashed on the join keys using a separate 'Hash' node. Postgres then iterates over the other collection of rows, for each one looking it up in the hash table to see if there are any rows it should be joined to.
The filter clause extends aggregate functions ( sum , avg , count , …) by an additional where clause. The result of the aggregate is built from only the rows that satisfy the additional where clause too.
The PostgreSQL SELECT statement retrieves data from a single or several tables in a database, and returns the data in a result table, called a result-set. Use the SELECT statement to return one or more rows matching the specified criteria from the database tables.
Try to replace the IN(subquery)
by the corresponding EXISTS(correlated subquery)
SELECT a0_.id AS id0
FROM advert a0_
INNER JOIN member m6_
ON a0_.user_id = m6_.id
INNER JOIN area a7_
ON m6_.area_id = a7_.id
WHERE a0_.status IN ( 1 )
AND m6_.status IN ( 1 )
AND EXISTS (
SELECT*
FROM network_area_flatdeep xx
WHERE xx.area_id = m6_.area_id
AND xx.network_id IN (2)
)
ORDER BY a0_.created_date DESC
LIMIT 60
;
An you do need of course primary keys and relavant foreign keys. I don'rt know about network_area_flatdeep.area_id
, if it is not a PK or FK, you might need an index on it (or make it part of a composite PK)
UPDATE: a testbed with synthetic data:
DROP SCHEMA tmp CASCADE;
CREATE SCHEMA tmp ;
SET search_path=tmp;
CREATE Table area
( id SERIAL not null PRIMARY KEY
, zzzz varchar
);
CREATE Table member
( id SERIAL not null PRIMARY KEY
, language_id integer not null DEFAULT 0
, area_id integer not null REFERENCES area(id)
, company_id integer
, external_source_id integer
, status integer not null DEFAULT 0
, name varchar not null
, zip_code varchar not null
, phone_number varchar default NULL
, using_email_service boolean not null DEFAULT False
, edited_date timestamp(0) without time zone default NULL
, created_date timestamp(0) without time zone not null
, hiding_on_own_network boolean not null DEFAULT False
, staff boolean not null DEFAULT False
, superuser boolean not null DEFAULT False
, external_id varchar default NULL
);
CREATE TABLE advert
( id SERIAL NOT NULL PRIMARY KEY
, user_id integer NOT NULL REFERENCES member(id)
, advert_category_id integer NOT NULL DEFAULT 0
, currency_id integer NOT NULL DEFAULT 0
, advert_kind_id integer NOT NULL DEFAULT 0
, advert_price_id integer
, external_source_id integer
, status integer NOT NULL DEFAULT 0
, type integer NOT NULL DEFAULT 0
, title varchar NOT NULL
, description text NOT NULL
, price numeric(10,2) default NULL
, accepting_bids boolean NOT NULL DEFAULT False
, promoted boolean NOT NULL DEFAULT False
, edited_date timestamp(0) without time zone default NULL
, created_date timestamp(0) without time zone NOT NULL
, archived_date timestamp(0) without time zone default NULL
, views integer NOT NULL DEFAULT 0
, checked_date timestamp(0) without time zone default NULL
, archived_by_cron boolean NOT NULL DEFAULT False
, unarchived_by_cron boolean NOT NULL DEFAULT False
, containting_forbidden_words boolean NOT NULL DEFAULT False
, external_id varchar default NULL
);
CREATE INDEX advert_created_date_idx ON advert (created_date);
CREATE Table network_area_flatdeep
-- the surrogate key in a junction table is questionable
( id SERIAL not null PRIMARY KEY
, network_id integer not null -- REFERENCES network(id) ON DELETE CASCADE
, area_id integer not null REFERENCES area(id) ON DELETE CASCADE
, created_date timestamp(0) without time zone not null
-- the date in the below constraint is questionable
, CONSTRAINT area_flatdeep_idx_uq UNIQUE (area_id, network_id, created_date)
);
CREATE INDEX idx_c29e880034128b91 ON network_area_flatdeep(network_id);
CREATE INDEX idx_c29e8800bd0f409c ON network_area_flatdeep(area_id);
INSERT INTO area ( zzzz)
SELECT 'Zzzz_' || gs::text
FROM generate_series(1,39) gs
;
INSERT INTO network_area_flatdeep
-- the surrogate key in a junction table is questionable
( network_id , area_id , created_date)
SELECT gs % 7 , aa.id, now()
FROM generate_series(1,76) gs
JOIN area aa ON aa.id = 1+gs % 39
;
INSERT INTO member
( area_id , name , zip_code, created_date)
SELECT aa.id
, 'Member_'|| gs::text
, 'Code_'|| gs::text
, now()
FROM generate_series(1, 10086) gs
JOIN area aa ON aa.id = 1 + gs % 39
;
INSERT INTO advert( user_id , title, description, edited_date , created_date)
SELECT 1+ (gs* 321) % 10086
, 'Tit_'|| gs::text
, 'Desc_'|| gs::text
, now()
, now() - (random() * 10000 * '1 sec'::interval)
from generate_series(1,47569 ) gs
;
UPDATE member SET status = 1 WHERE random() < .3;
UPDATE advert SET status = 1 WHERE random() < .3;
VACUUM ANALYZE member;
VACUUM ANALYZE advert;
VACUUM ANALYZE area;
VACUUM ANALYZE network_area_flatdeep;
My query executes in 12ms (PG-9.1), with the same plan as the OPs. (so this must be a configuration issue)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With