I am trying to test the efficiency of different partitioning schemas in PostgreSQL. I created several databases, each with one table and a different partition size, and populated hundreds of millions of records into each.
Now I would like to run some queries against all of them and compare the results. The issue is that I get different actual times on the same database when I run the same query.
For example, I executed

EXPLAIN (buffers, analyze, verbose)
SELECT count(*) FROM testdata WHERE dt = '2017-06-01 08:50:00';

the first time:
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=356199.96..356199.97 rows=1 width=8) (actual time=155757.190..155757.190 rows=1 loops=1)
   Output: count(*)
   Buffers: shared read=290555
   ->  Append  (cost=0.00..354506.51 rows=677381 width=0) (actual time=165.565..155583.815 rows=665275 loops=1)
         Buffers: shared read=290555
         ->  Seq Scan on public.testdata  (cost=0.00..0.00 rows=1 width=0) (actual time=0.002..0.002 rows=0 loops=1)
               Filter: (testdata.dt = '2017-06-01 08:50:00'::timestamp without time zone)
         ->  Bitmap Heap Scan on public.testdata_2017_06_01  (cost=12682.26..354506.51 rows=677380 width=0) (actual time=165.562..155319.677 rows=665275 loops=1)
               Recheck Cond: (testdata_2017_06_01.dt = '2017-06-01 08:50:00'::timestamp without time zone)
               Rows Removed by Index Recheck: 31006916
               Heap Blocks: exact=25283 lossy=263451
               Buffers: shared read=290555
               ->  Bitmap Index Scan on testdata_2017_06_01_dt_idx  (cost=0.00..12512.91 rows=677380 width=0) (actual time=159.597..159.597 rows=665275 loops=1)
                     Index Cond: (testdata_2017_06_01.dt = '2017-06-01 08:50:00'::timestamp without time zone)
                     Buffers: shared read=1821
 Planning time: 75.419 ms
 Execution time: 155757.564 ms
(17 rows)
and the second time:
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=356199.96..356199.97 rows=1 width=8) (actual time=18871.164..18871.164 rows=1 loops=1)
   Output: count(*)
   Buffers: shared read=290555
   ->  Append  (cost=0.00..354506.51 rows=677381 width=0) (actual time=148.208..18731.027 rows=665275 loops=1)
         Buffers: shared read=290555
         ->  Seq Scan on public.testdata  (cost=0.00..0.00 rows=1 width=0) (actual time=0.002..0.002 rows=0 loops=1)
               Filter: (testdata.dt = '2017-06-01 08:50:00'::timestamp without time zone)
         ->  Bitmap Heap Scan on public.testdata_2017_06_01  (cost=12682.26..354506.51 rows=677380 width=0) (actual time=148.204..18489.171 rows=665275 loops=1)
               Recheck Cond: (testdata_2017_06_01.dt = '2017-06-01 08:50:00'::timestamp without time zone)
               Rows Removed by Index Recheck: 31006916
               Heap Blocks: exact=25283 lossy=263451
               Buffers: shared read=290555
               ->  Bitmap Index Scan on testdata_2017_06_01_dt_idx  (cost=0.00..12512.91 rows=677380 width=0) (actual time=143.119..143.119 rows=665275 loops=1)
                     Index Cond: (testdata_2017_06_01.dt = '2017-06-01 08:50:00'::timestamp without time zone)
                     Buffers: shared read=1821
 Planning time: 2.105 ms
 Execution time: 18871.482 ms
(17 rows)
The difference is huge (155 seconds versus 18 seconds), and my first thought was that the result was cached after the first run. But the plans above show no hit values in their Buffers sections, and the estimated costs of both executions are identical. Why, then, is the actual time different? Is there some service work, such as creating a query plan, that takes this much time?
If I execute the same query for another date, the situation repeats. If PostgreSQL does cache query plans, is the plan regenerated for every new parameter value? That does not make sense to me, because then it would effectively behave like query-result caching.
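For reference, PostgreSQL does not cache plans for ad-hoc statements at all: a plain SELECT or EXPLAIN is planned from scratch every time. Plans are only reused for prepared statements (and statements inside PL/pgSQL functions). A minimal sketch, with a hypothetical statement name:

-- Plans are cached per prepared statement, not per parameter value.
PREPARE count_by_dt (timestamp) AS
    SELECT count(*) FROM testdata WHERE dt = $1;

EXECUTE count_by_dt('2017-06-01 08:50:00');  -- planned on first execution
EXECUTE count_by_dt('2017-06-02 08:50:00');  -- may reuse a generic plan

DEALLOCATE count_by_dt;

So the 155 s vs 18 s gap above cannot come from plan caching (planning time was only 75 ms vs 2 ms); it has to be data caching at some level.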
A side question: what is a better way to get the pure query execution time in order to compare different partitioning schemas? In MySQL I could use SELECT SQL_NO_CACHE to disable any caching for a query. In PostgreSQL there is DISCARD ALL, I think, but it does not help with the plans above.
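For clarity, a sketch of why DISCARD ALL does not help here:

-- DISCARD ALL resets session-local state only: prepared statements,
-- temp tables, session GUCs. It does NOT evict shared_buffers and it
-- cannot touch the OS page cache, so repeated runs stay "warm".
DISCARD ALL;

A truly cold run would require restarting PostgreSQL and additionally dropping the OS page cache at the operating-system level.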
Update #1:
Definition of main testdata table:
=> \d+ testdata;
                                Table "public.testdata"
   Column   |            Type             |   Modifiers   | Storage  | Stats target | Description
------------+-----------------------------+---------------+----------+--------------+-------------
 param1id   | integer                     | not null      | plain    |              |
 param2id   | integer                     |               | plain    |              |
 param3id   | integer                     |               | plain    |              |
 devid      | integer                     |               | plain    |              |
 nameid     | integer                     |               | plain    |              |
 dt         | timestamp without time zone | default now() | plain    |              |
 value      | double precision            |               | plain    |              |
 svalue     | character varying(255)      |               | extended |              |
Triggers:
    testdata_partition_insert_trigger BEFORE INSERT ON testdata FOR EACH ROW EXECUTE PROCEDURE create_partition_and_insert()
Child tables: testdata_2017_06_01,
              testdata_2017_06_02,
              ... other child tables, one per day ...
              testdata_2017_06_30
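The create_partition_and_insert() function itself is not shown; as a rough sketch, such routing triggers usually look something like the following (the body is an assumption, not the actual function):

-- Hypothetical reconstruction of a daily-partition routing function.
CREATE OR REPLACE FUNCTION create_partition_and_insert() RETURNS trigger AS $$
DECLARE
    child text := 'testdata_' || to_char(NEW.dt, 'YYYY_MM_DD');
BEGIN
    -- Create the daily child on first use, inheriting from the parent.
    EXECUTE format(
        'CREATE TABLE IF NOT EXISTS %I (CHECK (dt >= %L AND dt < %L)) INHERITS (testdata)',
        child,
        date_trunc('day', NEW.dt),
        date_trunc('day', NEW.dt) + interval '1 day');
    -- Route the row into the child instead of the parent.
    EXECUTE format('INSERT INTO %I SELECT ($1).*', child) USING NEW;
    RETURN NULL;  -- row already stored in the child; skip the parent insert
END;
$$ LANGUAGE plpgsql;

Note the sketch uses a half-open range check (dt < next day), while the actual partition_check below uses a closed range ending at 23:59:59.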
Definition of child testdata_2017_06_01:
\d+ testdata_2017_06_01;
                           Table "public.testdata_2017_06_01"
   Column   |            Type             |   Modifiers   | Storage  | Stats target | Description
------------+-----------------------------+---------------+----------+--------------+-------------
 param1id   | integer                     | not null      | plain    |              |
 param2id   | integer                     |               | plain    |              |
 param3id   | integer                     |               | plain    |              |
 devid      | integer                     |               | plain    |              |
 nameid     | integer                     |               | plain    |              |
 dt         | timestamp without time zone | default now() | plain    |              |
 value      | double precision            |               | plain    |              |
 svalue     | character varying(255)      |               | extended |              |
Indexes:
    "testdata_2017_06_01_devid_idx" btree (devid)
    "testdata_2017_06_01_dt_idx" btree (dt)
    "testdata_2017_06_01_nameid_idx" btree (nameid)
Check constraints:
    "partition_check" CHECK (dt >= '2017-06-01 00:00:00'::timestamp without time zone AND dt <= '2017-06-01 23:59:59'::timestamp without time zone)
Inherits: testdata
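With inheritance-based partitioning like this, pruning down to a single child (as in the plans above) relies on constraint exclusion; a quick sanity check:

-- 'partition' (the default) is sufficient: child constraints are only
-- examined for queries that go through the parent table.
SHOW constraint_exclusion;
SET constraint_exclusion = partition;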
Update #2:
After turning on track_io_timing following @rvs's advice, I rebooted the instance and the picture changed significantly. First, freeable memory increased from 6 GB to almost 8 GB (see screenshot). Second, the planner started using an Index Only Scan instead of a Bitmap Heap Scan. Third, all queries now take much less time (about 2.4 ms for the first run and 0.041 ms once cached, against the initial 155 and 18 seconds). And finally, queries started hitting the cache (in the examples above, all plans showed only read buffers):
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=4.59..4.60 rows=1 width=8) (actual time=0.014..0.014 rows=1 loops=1)
   Output: count(*)
   Buffers: shared hit=4
   ->  Append  (cost=0.00..4.58 rows=2 width=0) (actual time=0.012..0.012 rows=0 loops=1)
         Buffers: shared hit=4
         ->  Seq Scan on public.testdata  (cost=0.00..0.00 rows=1 width=0) (actual time=0.002..0.002 rows=0 loops=1)
               Filter: (testdata.dt = '2017-06-15 18:10:10'::timestamp without time zone)
         ->  Index Only Scan using testdata_2017_06_15_dt_idx on public.testdata_2017_06_15  (cost=0.56..4.58 rows=1 width=0) (actual time=0.009..0.009 rows=0 loops=1)
               Index Cond: (testdata_2017_06_15.dt = '2017-06-15 18:10:10'::timestamp without time zone)
               Heap Fetches: 0
               Buffers: shared hit=4
 Planning time: 0.888 ms
 Execution time: 0.044 ms
(13 rows)
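One note on the new plan (my understanding, not an authoritative rule): Heap Fetches: 0 means every page the index-only scan touched was marked all-visible. That stays true only while the visibility map is current, so after bulk loads it is worth vacuuming:

-- Keeps the visibility map current so Index Only Scans avoid heap fetches;
-- autovacuum normally does this, but not necessarily right after a bulk load.
VACUUM ANALYZE testdata_2017_06_15;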
Did the huge number of inserts overflow all available RAM (8 GB)? We inserted one partition with 40 million records and then copied it to the other partitions. Should we expect something like this in production (we are going to insert about 1k records per second using bulk inserts)? It is quite frightening if we have to restart the database instance from time to time.
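One way to watch whether reads are served from shared buffers over time (a rough sketch using the built-in statistics views; the OS page cache is invisible to PostgreSQL):

-- Rough shared-buffers hit ratio for the current database.
SELECT blks_hit::float / nullif(blks_hit + blks_read, 0) AS hit_ratio
FROM pg_stat_database
WHERE datname = current_database();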
Thank you for helping!
What's the amount of data and how much RAM do you have?
Most likely your subsequent queries are hitting the cache (PostgreSQL benefits from the OS-level cache as well). Try SET track_io_timing = true in your session and see what I/O timings Postgres reports for you.
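For example (a sketch; note the setting requires superuser privileges):

SET track_io_timing = true;  -- per-session, no restart needed
EXPLAIN (ANALYZE, BUFFERS, VERBOSE)
SELECT count(*) FROM testdata WHERE dt = '2017-06-01 08:50:00';
-- Plan nodes now include lines like "I/O Timings: read=..." showing how
-- much of the elapsed time was spent waiting on actual disk reads.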
Estimates are likely off due to an incorrect setting of effective_cache_size.
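A sketch of adjusting it (the value here is an assumption; it should roughly reflect shared_buffers plus the OS page cache, so most of your 8 GB):

-- effective_cache_size is a planner hint only; it allocates no memory.
ALTER SYSTEM SET effective_cache_size = '6GB';  -- hypothetical value
SELECT pg_reload_conf();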
To answer your side question: do not chase a 'pure' time; try to replicate the real scenario. Will this query run repeatedly? Will other queries run in parallel? Always run everything multiple times. Cache sizing is an important part of performance tuning.
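One convenient way to run a query many times and get latency statistics is pgbench with a custom script (a sketch; the file name is made up):

-- Save this as count_query.sql, then run:
--   pgbench -n -f count_query.sql -t 20 <dbname>
-- pgbench reports average latency over the 20 runs.
SELECT count(*) FROM testdata WHERE dt = '2017-06-01 08:50:00';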