Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Should I disable bitmapscan on PostgreSQL when using SSD?

Tags:

sql

postgresql

I was reading the https://use-the-index-luke.com/sql/where-clause/the-equals-operator/concatenated-keys when I came across these lines:

The PostgreSQL database uses two operations in this case: a Bitmap Index Scan followed by a Bitmap Heap Scan. They roughly correspond to Oracle's INDEX RANGE SCAN and TABLE ACCESS BY INDEX ROWID with one important difference: it first fetches all results from the index (Bitmap Index Scan), then sorts the rows according to the physical storage location of the rows in the heap table and than fetches all rows from the table (Bitmap Heap Scan). This method reduces the number of random access IOs on the table.

It occurred to me that this makes no sense when we are using Postgres on SSD. The calculation of sorting storage location may be a wast. Because SSDs are random-access only devices (if I didn’t get it wrong.)

And I did some test also, by turning on/off the enable_bitmapscan

set enable_bitmapscan to on;
explain analyse select count(distinct myid) from experiment.mytable where name='my_name';
----
QUERY PLAN
Aggregate  (cost=63196.06..63196.07 rows=1 width=8) (actual time=668.845..668.846 rows=1 loops=1)
  ->  Bitmap Heap Scan on mytable  (cost=696.41..63110.95 rows=34045 width=82) (actual time=54.967..216.382 rows=178705 loops=1)
        Recheck Cond: (name = 'my_name'::text)
        Heap Blocks: exact=164942
        ->  Bitmap Index Scan on mytable_name_visittime_idx  (cost=0.00..687.89 rows=34045 width=0) (actual time=28.365..28.365 rows=178705 loops=1)
              Index Cond: (name = 'my_name'::text)
Planning time: 1.411 ms
Execution time: 669.576 ms



set enable_bitmapscan to off;
explain analyse select count(distinct myid) from experiment.mytable where name='my_name';
----
QUERY PLAN
Aggregate  (cost=68369.46..68369.47 rows=1 width=8) (actual time=585.496..585.497 rows=1 loops=1)
  ->  Index Scan using mytable_name_visittime_idx on mytable  (cost=0.56..68284.34 rows=34045 width=82) (actual time=0.019..126.553 rows=178705 loops=1)
        Index Cond: (name = 'my_name'::text)
Planning time: 0.062 ms
Execution time: 585.542 ms

There is indeed a noticeable improvement When enable_bitmapscan the planner use the BitmapHeapScan + BitmapIndexScan. When disable it the planner choose the IndexScan only.

like image 349
kehao Avatar asked Jul 30 '18 06:07

kehao


People also ask

Is bitmap heap scan bad?

Bitmap Heap Scans aren't inherently bad; in fact, they include a built-in optimization to only fetch from disk once we know what we need which can avoid unnecessary duplicate fetches. Fetching rows from disk to satisfy multiple index usage.

What is random page cost in postgresql?

random_page_cost ( floating point ) Sets the planner's estimate of the cost of a non-sequentially-fetched disk page. The default is 4.0. This value can be overridden for tables and indexes in a particular tablespace by setting the tablespace parameter of the same name (see ALTER TABLESPACE).

What is bitmap index scan?

You can think of a bitmap index scan as a middle ground between a sequential scan and an index scan. Like an index scan, it scans an index to determine exactly what data it needs to fetch, but like a sequential scan, it takes advantage of data being easier to read in bulk.


1 Answers

You can also tune the config to let PostgreSQL decide whether random IO cost will be more than the sequential cost.

Change this setting - random_page_cost in postgresql.conf to 1.0, which is equivalent to seq_page_cost.

This will tell PostgreSQL that the cost of random IO is equivalent to the cost of sequential IO.

like image 187
Anmol Avatar answered Oct 04 '22 22:10

Anmol