Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

PostgreSQL Bitmap Heap Scan on index is very slow but Index Only Scan is fast

I create a table with 43kk rows, populate them with values 1..200. So ~220k per each number spreaded through the table.

create table foo (id integer primary key, val bigint);
insert into foo
  select i, random() * 200 from generate_series(1, 43000000) as i;
create index val_index on foo(val);
vacuum analyze foo;
explain analyze select id from foo where val = 55;

Result: http://explain.depesz.com/s/fdsm

I expect total runtime < 1s, is it possible? I have SSD, core i5 (1,8), 4gb RAM. 9,3 Postgres.

If I use Index Only scan it works very fast:

explain analyze select val from foo where val = 55;

http://explain.depesz.com/s/7hm

But I need to select id not val so Incex Only scan is not suitable in my case.

Thanks in advance!

Additional info:

SELECT relname, relpages, reltuples::numeric, pg_size_pretty(pg_table_size(oid)) 
FROM pg_class WHERE oid='foo'::regclass;

Result:

"foo";236758;43800000;"1850 MB"

Config:

"cpu_index_tuple_cost";"0.005";""
"cpu_operator_cost";"0.0025";""
"cpu_tuple_cost";"0.01";""
"effective_cache_size";"16384";"8kB"
"max_connections";"100";""
"max_stack_depth";"2048";"kB"
"random_page_cost";"4";""
"seq_page_cost";"1";""
"shared_buffers";"16384";"8kB"
"temp_buffers";"1024";"8kB"
"work_mem";"204800";"kB"
like image 870
fasth Avatar asked Oct 31 '14 02:10

fasth


People also ask

Does PostgreSQL support bitmap indexes?

PostgreSQL doesn't support BITMAP index. You can use BRIN index in some cases.

What is bitmap heap scan in Postgres?

A bitmap heap scan takes a row location bitmap generated by a Bitmap Index Scan (either directly, or through a series of bitmap set operations via BitmapAnd and BitmapOr nodes) and looks up the relevant data.

What is parallel bitmap heap scan?

Blocks are handed out one at a time, so that access to the table remains sequential. In a parallel bitmap heap scan, one process is chosen as the leader. That process performs a scan of one or more indexes and builds a bitmap indicating which table blocks need to be visited.

What is index only scan?

An index-only scan, after finding a candidate index entry, checks the visibility map bit for the corresponding heap page. If it's set, the row is known visible and so the data can be returned with no further work.


2 Answers

I have got answer here: http://ask.use-the-index-luke.com/questions/235/postgresql-bitmap-heap-scan-on-index-is-very-slow-but-index-only-scan-is-fast

The trick is to use composite index for id and value:

create index val_id_index on foo(val, id);

So Index Only scan will be used, but I can select id now.

select id from foo where val = 55;

Result:

http://explain.depesz.com/s/nDt3

But this works ONLY in Postgres with version 9.2+. If you have forced to use versions below try another options.

like image 138
fasth Avatar answered Oct 21 '22 19:10

fasth


Although you're querying only 0,5% of the table, or ~10MB worth of data (out of nearly 2GB table), values of interest are spread evenly across whole table.

You can see it in the first plan you've provided:

  • BitmapIndexScan completes in 123.172ms
  • BitmapHeapScan takes 17055.046ms.

You can try clustering your tables based on index order, which will put rows together on the same pages. On my SATA disks I have the following:

SET work_mem TO '300MB';
EXPLAIN (analyze,buffers) SELECT id FROM foo WHERE val = 55;

  Bitmap Heap Scan on foo  (...) (actual time=90.315..35091.665 rows=215022 loops=1)
    Heap Blocks: exact=140489
    Buffers: shared hit=20775 read=120306 written=24124

SET maintenance_work_mem TO '1GB';
CLUSTER foo USING val_index;
EXPLAIN (analyze,buffers) SELECT id FROM foo WHERE val = 55;

  Bitmap Heap Scan on foo  (...) (actual time=49.215..407.505 rows=215022 loops=1)
    Heap Blocks: exact=1163
    Buffers: shared read=1755

Of course, this is a one-time operation and it'll get longer bit-by-bit over the time.

like image 5
vyegorov Avatar answered Oct 21 '22 18:10

vyegorov