The table in question has a B-tree index on time
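The index definition isn't shown, but judging from the plan below it is presumably something like this (a guess, not the actual DDL from the question):
CREATE INDEX timeindex ON cdstest ("time");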
testdb=> explain analyze select avg(gl) from cdstest where time between 1407700790 and 1407711590;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=1434716.75..1434716.76 rows=1 width=2) (actual time=20106.951..20106.952 rows=1 loops=1)
-> Bitmap Heap Scan on cdstest (cost=231261.49..1411280.42 rows=9374529 width=2) (actual time=811.495..10871.963 rows=9438824 loops=1)
Recheck Cond: (("time" >= 1407700790) AND ("time" <= 1407711590))
Rows Removed by Index Recheck: 204734
-> Bitmap Index Scan on timeindex (cost=0.00..228917.86 rows=9374529 width=0) (actual time=810.108..810.108 rows=9438824 loops=1)
Index Cond: (("time" >= 1407700790) AND ("time" <= 1407711590))
Total runtime: 20107.001 ms
(7 rows)
Rows Removed by Index Recheck: 204734
- What does this mean? This seems like a fairly arbitrary number.
Number of rows between the given time range:
testdb=> select count(*) from cdstest where time between 1407700790 and 1407711590;
count
---------
9438824
(1 row)
The table contains ~60 million rows.
The inner Bitmap Index Scan node produces a bitmap, putting a 1 in every position where a record matching your search key is found, and a 0 otherwise. Since your table is quite big, the bitmap outgrows the memory available for this kind of operation, which is configured via work_mem, so the whole bitmap can no longer be kept in memory.
When memory runs short, the inner node starts setting the 1 not for individual records, but for whole blocks that are known to contain matching records. This means the outer Bitmap Heap Scan node has to read all records from such a block and re-check them. Obviously, some of them will not match, and their number is what you see as Rows Removed by Index Recheck.
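Since the size limit here is work_mem, a first step is to see what it is currently set to (both statements below are standard PostgreSQL, shown just as a sketch):
SHOW work_mem;
SELECT current_setting('work_mem');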
In the upcoming 9.4 a new feature was added that reports how many exact and/or lossy pages were returned by the Bitmap Index Scan node; lossy pages are the ones you want to avoid. You can read more about this here.
Finally, check your work_mem setting and try increasing it, just for this particular session. I assume that increasing it by some 40% should be enough.
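Just as a sketch, a session-local change looks like this (the 256MB value is purely illustrative; pick roughly 40% more than your current setting):
SET work_mem TO '256MB';
EXPLAIN ANALYZE SELECT avg(gl) FROM cdstest WHERE time BETWEEN 1407700790 AND 1407711590;
RESET work_mem;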
EDIT
I have 9.4beta3 running here, so I prepared a small showcase:
DROP TABLE IF EXISTS tab;
SELECT id, id % 10 AS mod
INTO tab
FROM generate_series(1, (1e7)::int) id;
CREATE INDEX i_tab_mod ON tab(mod);
VACUUM ANALYZE tab;
Now I set work_mem
to the minimal possible value and check the plan:
SET work_mem TO '64kB';
EXPLAIN (analyze, buffers)
SELECT * FROM tab WHERE mod=5;
EXPLAIN
output includes the following rows:
Rows Removed by Index Recheck: 8896308
Heap Blocks: exact=510 lossy=43738
...
Execution time: 1356.938 ms
This means that 64kB
can hold 510 exact blocks. So I calculate the total memory requirement here:
new_mem_in_kB = (work_mem_in_kB / exact) * lossy
              = (64.0 / 510) * 43738
              = 5488.7kB
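The same arithmetic as a quick one-liner, if you want to plug in your own exact/lossy numbers (64 is the work_mem in kB from above):
SELECT round((64.0 / 510) * 43738, 1) AS needed_work_mem_kb;  -- ≈ 5488.7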
This is not a precise way to calculate the needed memory, but I think it is good enough for our needs. So I tried with SET work_mem TO '5MB'
:
Heap Blocks: exact=44248
...
Execution time: 283.466 ms
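For completeness, that second run is presumably just the same query repeated under the larger setting:
SET work_mem TO '5MB';
EXPLAIN (analyze, buffers)
SELECT * FROM tab WHERE mod=5;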