 

HBase: Why are there evicted blocks before the max size of the BlockCache is reached?

I am currently using a stock configuration of Apache HBase, with the RegionServer heap at 4 GB and the BlockCache sized at 40% of it, so around 1.6 GB. No L2/BucketCache is configured.

Here are the BlockCache metrics after ~2K requests to the RegionServer. As you can see, blocks have already been evicted, probably causing some of the misses.

Why were they evicted when we aren't even close to the limit?

Metric          Value      Description
Size            2.1 M      Current size of block cache in use (bytes)
Free            1.5 G      Total free memory currently available to store more cache entries (bytes)
Count           18         Number of blocks in the block cache
Evicted         14         Total number of blocks evicted
Evictions       1,645      Total number of times an eviction has occurred
Mean            10,984     Mean age of blocks at eviction time (seconds)
StdDev          5,853,922  Standard deviation of block age at eviction time
Hits            1,861      Number of requests that were cache hits
Hits Caching    1,854      Cache-hit block requests, counting only requests set to cache the block on a miss
Misses          58         Block requests that were cache misses but set to cache missed blocks
Misses Caching  58         Block requests that were cache misses, counting only requests set to use the block cache
Hit Ratio       96.98%     Hit count divided by total request count

asked Mar 13 '23 by jastang


1 Answer

What you are seeing is the effect of the LRU treating blocks with three levels of priority: single-access, multi-access, and in-memory. For the default L1 LruBlockCache implementation, their share of the cache can be set with the following properties (default values in parentheses):

  • hbase.lru.blockcache.single.percentage (25%)
  • hbase.lru.blockcache.multi.percentage (50%)
  • hbase.lru.blockcache.memory.percentage (25%)

For the 4 GB heap example, and 40% set aside for the cache, you have 1.6 GB heap, which is further divided into 400 MB, 800 MB, and 400 MB for each priority level, based on the above percentages.
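The arithmetic above can be checked with a short sketch; the percentages are the LruBlockCache defaults quoted earlier (the exact byte counts differ slightly from the rounded 400/800/400 MB figures):

```python
# Sketch: how the BlockCache heap share is partitioned by priority level,
# using the default LruBlockCache percentages quoted above.
HEAP_BYTES = 4 * 1024**3          # 4 GB RegionServer heap
HFILE_BLOCK_CACHE_SIZE = 0.40     # hfile.block.cache.size (40%)

cache_bytes = HEAP_BYTES * HFILE_BLOCK_CACHE_SIZE   # ~1.6 GB

shares = {
    "single-access": 0.25,  # hbase.lru.blockcache.single.percentage
    "multi-access": 0.50,   # hbase.lru.blockcache.multi.percentage
    "in-memory": 0.25,      # hbase.lru.blockcache.memory.percentage
}

for level, pct in shares.items():
    print(f"{level}: {cache_bytes * pct / 1024**2:.0f} MB")
```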

When a block is loaded from storage it is usually flagged as single-access, unless the column family it belongs to has been configured with IN_MEMORY = true, in which case its priority is in-memory. If a single-access block is requested again by a later read, it is promoted to multi-access priority.
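For example, a column family can be given in-memory priority at table-creation time in the HBase shell ('mytable' and 'cf' here are placeholder names):

```
create 'mytable', {NAME => 'cf', IN_MEMORY => 'true'}
```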

The LruBlockCache has an internal eviction thread that runs every 10 seconds and checks whether the blocks in each level together exceed their allowed percentage. Now, if you scan a larger table once, and assuming the cache was completely empty, all of the blocks are tagged single-access. If the table was 1 GB in size, you have loaded 1 GB into a 400 MB cache space, which the eviction thread is going to reduce in due course. In fact, depending on how long the scan takes, the eviction thread's 10-second interval may elapse during the scan, and it will start to evict blocks as soon as you exceed the 25% threshold.
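The per-tier trimming described above can be illustrated with a toy model (this is not HBase's actual LruBlockCache code, just the idea): each priority level is an LRU with its own byte budget, and a periodic pass evicts the oldest blocks from any tier that is over its share.

```python
from collections import OrderedDict

# Toy model of per-priority eviction -- not HBase's actual implementation.
# Each tier is an LRU (OrderedDict, oldest entry first) with its own byte
# budget; the eviction pass trims each over-budget tier, oldest blocks first.
def evict_pass(tiers, budgets):
    evicted = []
    for name, lru in tiers.items():              # e.g. single, multi, in-memory
        used = sum(lru.values())                 # bytes held by this tier
        while used > budgets[name] and lru:
            block, size = lru.popitem(last=False)  # drop least-recently-added
            used -= size
            evicted.append(block)
    return evicted

# A 1 GB scan's worth of single-access blocks against a 400 MB budget:
single = OrderedDict((f"block-{i}", 64 * 1024**2) for i in range(16))  # 16 x 64 MB
tiers = {"single": single}
budgets = {"single": 400 * 1024**2}
gone = evict_pass(tiers, budgets)
print(len(gone), "blocks evicted")  # -> 10 blocks evicted
```

The scan overfills the single-access area, so most of its blocks are gone again shortly after loading, even though the cache as a whole is far from full.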

The eviction will first evict blocks from the single-access area, then the multi-access area, and finally, if there is still pressure on the heap, from the in-memory area. That is also why you should make sure the working set of your in-memory flagged column families does not exceed the configured cache area.

What can you do? If you have mostly single-access blocks, you could tweak the above percentages to give more to the single-access area of the LRU.
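For instance, shifting some weight from the multi-access to the single-access area might look like this in hbase-site.xml (the 0.40/0.35/0.25 split is only an illustration; the three fractions should sum to 1.0):

```xml
<!-- Illustrative values only: give single-access blocks a larger share. -->
<property>
  <name>hbase.lru.blockcache.single.percentage</name>
  <value>0.40</value>
</property>
<property>
  <name>hbase.lru.blockcache.multi.percentage</name>
  <value>0.35</value>
</property>
<property>
  <name>hbase.lru.blockcache.memory.percentage</name>
  <value>0.25</value>
</property>
```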

answered Apr 07 '23 by Lars George