Why is the actual time in an EXPLAIN result different with every execution in PostgreSQL?

I am trying to test the efficiency of different partitioning schemes in PostgreSQL. I created several databases, each with one table and a different partition size, and then populated each with hundreds of millions of records.

Now I would like to run some queries on all of them and compare the results. The issue is that I get a different actual time when I run the same query on the same database.

For example, the first time I executed EXPLAIN (buffers, analyze, verbose) SELECT count(*) FROM testdata WHERE dt = '2017-06-01 08:50:00' I got:

                                                                         QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------
Aggregate  (cost=356199.96..356199.97 rows=1 width=8) (actual time=155757.190..155757.190 rows=1 loops=1)
  Output: count(*)
  Buffers: shared read=290555
  ->  Append  (cost=0.00..354506.51 rows=677381 width=0) (actual time=165.565..155583.815 rows=665275 loops=1)
        Buffers: shared read=290555
        ->  Seq Scan on public.testdata  (cost=0.00..0.00 rows=1 width=0) (actual time=0.002..0.002 rows=0 loops=1)
              Filter: (testdata.dt = '2017-06-01 08:50:00'::timestamp without time zone)
        ->  Bitmap Heap Scan on public.testdata_2017_06_01  (cost=12682.26..354506.51 rows=677380 width=0) (actual time=165.562..155319.677 rows=665275 loops=1)
              Recheck Cond: (testdata_2017_06_01.dt = '2017-06-01 08:50:00'::timestamp without time zone)
              Rows Removed by Index Recheck: 31006916
              Heap Blocks: exact=25283 lossy=263451
              Buffers: shared read=290555
              ->  Bitmap Index Scan on testdata_2017_06_01_dt_idx  (cost=0.00..12512.91 rows=677380 width=0) (actual time=159.597..159.597 rows=665275 loops=1)
                    Index Cond: (testdata_2017_06_01.dt = '2017-06-01 08:50:00'::timestamp without time zone)
                    Buffers: shared read=1821
Planning time: 75.419 ms
Execution time: 155757.564 ms
(17 rows)

and the second time:

                                                                           QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------
Aggregate  (cost=356199.96..356199.97 rows=1 width=8) (actual time=18871.164..18871.164 rows=1 loops=1)
  Output: count(*)
  Buffers: shared read=290555
  ->  Append  (cost=0.00..354506.51 rows=677381 width=0) (actual time=148.208..18731.027 rows=665275 loops=1)
        Buffers: shared read=290555
        ->  Seq Scan on public.testdata  (cost=0.00..0.00 rows=1 width=0) (actual time=0.002..0.002 rows=0 loops=1)
              Filter: (testdata.dt = '2017-06-01 08:50:00'::timestamp without time zone)
        ->  Bitmap Heap Scan on public.testdata_2017_06_01  (cost=12682.26..354506.51 rows=677380 width=0) (actual time=148.204..18489.171 rows=665275 loops=1)
              Recheck Cond: (testdata_2017_06_01.dt = '2017-06-01 08:50:00'::timestamp without time zone)
              Rows Removed by Index Recheck: 31006916
              Heap Blocks: exact=25283 lossy=263451
              Buffers: shared read=290555
              ->  Bitmap Index Scan on testdata_2017_06_01_dt_idx  (cost=0.00..12512.91 rows=677380 width=0) (actual time=143.119..143.119 rows=665275 loops=1)
                    Index Cond: (testdata_2017_06_01.dt = '2017-06-01 08:50:00'::timestamp without time zone)
                    Buffers: shared read=1821
Planning time: 2.105 ms
Execution time: 18871.482 ms
(17 rows)

The difference is huge (155 s versus 18 s), and my first thought was that the result was cached after the first run. But the EXPLAIN output above does not display any hit values in the buffers sections. Also, the estimated costs of both executions are the same. Why, then, is the actual time different? Is there some service work, such as creating the query plan, that takes this much time?

If I execute the same query for another date, the situation repeats. If PostgreSQL caches query plans, is the plan regenerated for every new parameter value? That does not make sense to me, because then it would effectively work like query-result caching.
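
As a point of reference, plan reuse in PostgreSQL only happens for prepared statements; a plain query is re-planned on every execution. A hypothetical sketch (the statement name is made up):

    -- Plain statements are planned on every execution; only PREPAREd
    -- statements can reuse a plan:
    PREPARE count_by_dt (timestamp) AS
        SELECT count(*) FROM testdata WHERE dt = $1;
    EXECUTE count_by_dt('2017-06-01 08:50:00');
    EXECUTE count_by_dt('2017-06-02 08:50:00');  -- may reuse the plan
    DEALLOCATE count_by_dt;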

A side question: what is a better way to get the pure query execution time in order to compare different partitioning schemes? In MySQL I could use SELECT SQL_NO_CACHE to disable caching for a query. In PostgreSQL there is DISCARD ALL, I think, but it does not help with the EXPLAINs above.
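
A note on DISCARD ALL: it resets session state (prepared statements, temporary tables, and so on) but does not touch shared buffers or the OS page cache, so it cannot produce cold-cache timings; only a server restart combined with dropping the OS page cache does that. The opposite approach is to make runs comparable by warming the cache deliberately before timing. A sketch, assuming the pg_prewarm extension (PostgreSQL 9.4+):

    -- Load the partition's heap and index into the cache first, so every
    -- timed run starts from the same warm state:
    CREATE EXTENSION IF NOT EXISTS pg_prewarm;
    SELECT pg_prewarm('testdata_2017_06_01');        -- heap
    SELECT pg_prewarm('testdata_2017_06_01_dt_idx'); -- index
    \timing on
    SELECT count(*) FROM testdata WHERE dt = '2017-06-01 08:50:00';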

Update #1:

Definition of main testdata table:

=> \d+ testdata;
                                    Table "public.testdata"
   Column   |            Type             |   Modifiers   | Storage  | Stats target | Description
------------+-----------------------------+---------------+----------+--------------+-------------
 param1id   | integer                     | not null      | plain    |              |
 param2id   | integer                     |               | plain    |              |
 param3id   | integer                     |               | plain    |              |
 devid      | integer                     |               | plain    |              |
 nameid     | integer                     |               | plain    |              |
 dt         | timestamp without time zone | default now() | plain    |              |
 value      | double precision            |               | plain    |              |
 svalue     | character varying(255)      |               | extended |              |
Triggers:
    testdata_partition_insert_trigger BEFORE INSERT ON testdata FOR EACH ROW EXECUTE PROCEDURE create_partition_and_insert()
Child tables: testdata_2017_06_01,
              testdata_2017_06_02,
              ... other child tables, one per day ...
              testdata_2017_06_30

Definition of child testdata_2017_06_01:

\d+ testdata_2017_06_01;
                               Table "public.testdata_2017_06_01"
   Column   |            Type             |   Modifiers   | Storage  | Stats target | Description
------------+-----------------------------+---------------+----------+--------------+-------------
 param1id   | integer                     | not null      | plain    |              |
 param2id   | integer                     |               | plain    |              |
 param3id   | integer                     |               | plain    |              |
 devid      | integer                     |               | plain    |              |
 nameid     | integer                     |               | plain    |              |
 dt         | timestamp without time zone | default now() | plain    |              |
 value      | double precision            |               | plain    |              |
 svalue     | character varying(255)      |               | extended |              |
Indexes:
    "testdata_2017_06_01_devid_idx" btree (devid)
    "testdata_2017_06_01_dt_idx" btree (dt)
    "testdata_2017_06_01_nameid_idx" btree (nameid)
Check constraints:
    "partition_check" CHECK (dt >= '2017-06-01 00:00:00'::timestamp without time zone AND dt <= '2017-06-01 23:59:59'::timestamp without time zone)
Inherits: testdata

Update #2:

After turning on track_io_timing, following @rvs's advice, I rebooted the instance and the picture changed significantly. First, freeable memory increased from 6 GB to almost 8 GB. Second, the planner started using an Index Only Scan instead of a Bitmap Heap Scan. Third, all queries now take much less time (about 2.4 ms for the first query and 0.041 ms when it hits the cache, against the initial 155 and 18 seconds). And last, queries started hitting the cache (in the examples above, all EXPLAIN outputs displayed only read buffers).

                                                                               QUERY PLAN
 ----------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=4.59..4.60 rows=1 width=8) (actual time=0.014..0.014 rows=1 loops=1)
   Output: count(*)
   Buffers: shared hit=4
   ->  Append  (cost=0.00..4.58 rows=2 width=0) (actual time=0.012..0.012 rows=0 loops=1)
         Buffers: shared hit=4
         ->  Seq Scan on public.testdata  (cost=0.00..0.00 rows=1 width=0) (actual time=0.002..0.002 rows=0 loops=1)
               Filter: (testdata.dt = '2017-06-15 18:10:10'::timestamp without time zone)
         ->  Index Only Scan using testdata_2017_06_15_dt_idx on public.testdata_2017_06_15  (cost=0.56..4.58 rows=1 width=0) (actual time=0.009..0.009 rows=0 loops=1)
               Index Cond: (testdata_2017_06_15.dt = '2017-06-15 18:10:10'::timestamp without time zone)
               Heap Fetches: 0
               Buffers: shared hit=4
 Planning time: 0.888 ms
 Execution time: 0.044 ms
(13 rows)

Did the huge number of inserts overflow all available RAM (8 GB)? We inserted one partition with 40 million records and then copied it to the other partitions. Should we expect something like this in production (we are going to insert about 1k records per second with bulk inserts)? It is quite frightening if we need to restart the database instance from time to time.
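
To see whether the bulk load really pushed everything else out of shared buffers, the contents of the buffer cache can be inspected directly. A sketch, assuming the pg_buffercache extension (the join is simplified and does not filter by database):

    CREATE EXTENSION IF NOT EXISTS pg_buffercache;
    -- Which relations currently occupy the most shared buffers:
    SELECT c.relname, count(*) AS buffers
    FROM pg_buffercache b
    JOIN pg_class c ON b.relfilenode = pg_relation_filenode(c.oid)
    GROUP BY c.relname
    ORDER BY buffers DESC
    LIMIT 10;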

Thank you for helping!

asked Oct 29 '22 by dmkov

1 Answer

What's the amount of data and how much RAM do you have?

Most likely your subsequent queries are hitting the cache (PostgreSQL uses the OS-level cache as well). Try SET track_io_timing = on in your session and see what I/O timings Postgres reports.
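
With the setting on, EXPLAIN (ANALYZE, BUFFERS) adds an "I/O Timings" line, which separates time spent waiting on reads from the rest of the work; on a warm cache that line shrinks toward zero even though the shared read counts look the same:

    SET track_io_timing = on;
    EXPLAIN (ANALYZE, BUFFERS, VERBOSE)
    SELECT count(*) FROM testdata WHERE dt = '2017-06-01 08:50:00';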

The estimates are likely off due to an incorrect effective_cache_size setting.
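
effective_cache_size does not allocate any memory; it only tells the planner how much data it may assume is cached (shared_buffers plus the OS page cache). A sketch; the 6 GB figure is an assumption for an 8 GB machine, not a recommendation for this exact workload:

    -- Applies cluster-wide without a restart (PostgreSQL 9.4+):
    ALTER SYSTEM SET effective_cache_size = '6GB';
    SELECT pg_reload_conf();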

To answer your side question: do not chase a "pure" time. Try to replicate the real scenario. Will this query run repeatedly? Will other queries run in parallel? Always run everything multiple times. Cache sizing is an important part of performance tuning.

answered Nov 15 '22 by rvs