I was reading the https://use-the-index-luke.com/sql/where-clause/the-equals-operator/concatenated-keys when I came across these lines:
The PostgreSQL database uses two operations in this case: a Bitmap Index Scan followed by a Bitmap Heap Scan. They roughly correspond to Oracle's INDEX RANGE SCAN and TABLE ACCESS BY INDEX ROWID with one important difference: it first fetches all results from the index (Bitmap Index Scan), then sorts the rows according to the physical storage location of the rows in the heap table and than fetches all rows from the table (Bitmap Heap Scan). This method reduces the number of random access IOs on the table.
It occurred to me that this makes no sense when we are using Postgres on SSD. The calculation of sorting storage location may be a wast. Because SSDs are random-access only devices (if I didn’t get it wrong.)
And I did some test also, by turning on/off the enable_bitmapscan
set enable_bitmapscan to on;
explain analyse select count(distinct myid) from experiment.mytable where name='my_name';
----
QUERY PLAN
Aggregate (cost=63196.06..63196.07 rows=1 width=8) (actual time=668.845..668.846 rows=1 loops=1)
-> Bitmap Heap Scan on mytable (cost=696.41..63110.95 rows=34045 width=82) (actual time=54.967..216.382 rows=178705 loops=1)
Recheck Cond: (name = 'my_name'::text)
Heap Blocks: exact=164942
-> Bitmap Index Scan on mytable_name_visittime_idx (cost=0.00..687.89 rows=34045 width=0) (actual time=28.365..28.365 rows=178705 loops=1)
Index Cond: (name = 'my_name'::text)
Planning time: 1.411 ms
Execution time: 669.576 ms
set enable_bitmapscan to off;
explain analyse select count(distinct myid) from experiment.mytable where name='my_name';
----
QUERY PLAN
Aggregate (cost=68369.46..68369.47 rows=1 width=8) (actual time=585.496..585.497 rows=1 loops=1)
-> Index Scan using mytable_name_visittime_idx on mytable (cost=0.56..68284.34 rows=34045 width=82) (actual time=0.019..126.553 rows=178705 loops=1)
Index Cond: (name = 'my_name'::text)
Planning time: 0.062 ms
Execution time: 585.542 ms
There is indeed a noticeable improvement When enable_bitmapscan the planner use the BitmapHeapScan + BitmapIndexScan. When disable it the planner choose the IndexScan only.
Bitmap Heap Scans aren't inherently bad; in fact, they include a built-in optimization to only fetch from disk once we know what we need which can avoid unnecessary duplicate fetches. Fetching rows from disk to satisfy multiple index usage.
random_page_cost ( floating point ) Sets the planner's estimate of the cost of a non-sequentially-fetched disk page. The default is 4.0. This value can be overridden for tables and indexes in a particular tablespace by setting the tablespace parameter of the same name (see ALTER TABLESPACE).
You can think of a bitmap index scan as a middle ground between a sequential scan and an index scan. Like an index scan, it scans an index to determine exactly what data it needs to fetch, but like a sequential scan, it takes advantage of data being easier to read in bulk.
You can also tune the config to let PostgreSQL decide whether random IO cost will be more than the sequential cost.
Change this setting - random_page_cost in postgresql.conf
to 1.0, which is equivalent to seq_page_cost
.
This will tell PostgreSQL that the cost of random IO is equivalent to the cost of sequential IO.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With