I have the following index:
CREATE INDEX index_c_profiles_on_city_state_name_domain ON
c_profiles ((data->>'state'), (data->>'city'), name, domain);
and I am using the following query:
SELECT mm.name, mm.domain, mm.data ->> 'city' as city, mm.data ->>
'state' as state
FROM c_profiles as mm
WHERE ((mm.data ->> 'state') = 'AZ')
but when I test this using EXPLAIN ANALYZE, it is always doing a Bitmap Index Scan (Good and fast) followed by a Bitmap Heap Scan which is very very slow (Usually 100x slower than the index scan alone).
I have also tried to just index the WHERE condition, and the results are the same, it still is doing the very slow Bitmap Heap Scan after it uses the index.
Why is Postgres doing this? How can I have it just do the Index scan to make this query fast?
Here is a sample EXPLAIN ANALYZE result:
[
{
"Execution Time": 53.655,
"Planning Time": 0.081,
"Plan": {
"Exact Heap Blocks": 1338,
"Node Type": "Bitmap Heap Scan",
"Actual Total Time": 53.031,
"Shared Hit Blocks": 727,
"Schema": "public",
"Plans": [
{
"Node Type": "Bitmap Index Scan",
"Actual Total Time": 0.455,
"Shared Hit Blocks": 2,
"Shared Read Blocks": 13,
"Temp Written Blocks": 0,
"Local Dirtied Blocks": 0,
"Local Hit Blocks": 0,
"Plan Width": 0,
"Actual Loops": 1,
"Actual Startup Time": 0.455,
"Temp Read Blocks": 0,
"Local Read Blocks": 0,
"Index Name": "index_mattermark_profiles_on_city_state_name_domain",
"Startup Cost": 0,
"Shared Dirtied Blocks": 0,
"Shared Written Blocks": 0,
"Local Written Blocks": 0,
"Plan Rows": 788,
"Index Cond": "((mm.data ->> 'state'::text) = 'AZ'::text)",
"Actual Rows": 1417,
"Parent Relationship": "Outer",
"Total Cost": 34.33
}
],
"Shared Read Blocks": 650,
"Relation Name": "mattermark_profiles",
"Local Hit Blocks": 0,
"Local Dirtied Blocks": 0,
"Temp Written Blocks": 0,
"Plan Width": 1010,
"Actual Loops": 1,
"Rows Removed by Index Recheck": 0,
"Lossy Heap Blocks": 0,
"Alias": "mm",
"Recheck Cond": "((mm.data ->> 'state'::text) = 'AZ'::text)",
"Temp Read Blocks": 0,
"Output": [
"name",
"domain",
"(data ->> 'city'::text)",
"(data ->> 'state'::text)"
],
"Actual Startup Time": 0.703,
"Local Read Blocks": 0,
"Startup Cost": 34.53,
"Shared Dirtied Blocks": 0,
"Shared Written Blocks": 0,
"Local Written Blocks": 0,
"Plan Rows": 788,
"Actual Rows": 1417,
"Total Cost": 2894.17
},
"Triggers": []
}
]
PostgreSQL chooses a bitmap index scan rather than a normal index scan when it thinks that it will be faster.
This is normally the case when the estimated result row count is high.
A normal index scan would have to visit the table for each index entry found, which causes a lot of random I/O on the table and might require the same block to be processed several times.
A bitmap index scan works by first finding all index entries, sorting them in the order of their physical location in the table and then scanning the required blocks from the table. That is more efficient, because it will scan the table blocks sequentially.
The second step of that, the bitmap heap scan, appears as its own node in the EXPLAIN output and is usually the more expensive step.
So everything seems in order.
You can try to set enable_bitmapscan to off and see if PostgreSQL is right and the resulting plan would be more expensive.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With