HBase scans are slow

Tags:

hbase

phoenix

Problem

I am trying to build a secondary index with Phoenix. Index creation takes several hours. It seems to be due to slow HBase scans, as I noticed the following:

  • I need about 2 hours to scan the table, whereas other developers report a few minutes for larger tables (100 million rows).
  • The HBase shell counts rows at approx. 10,000 per second, which means 3800 s (over an hour!) to count all rows of this table.

These timings hold both with the HBase shell and with a Java scanner.

NB: GET (by row key) operations perform well (approx. 0.5 s).
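For context, here is a minimal sketch of the kind of Java scanner behind these timings (the table name "mytable" is a placeholder, and this uses the HBase 1.x client API; the 0.98 client shipped with HDP 2.2 would use new HTable(conf, "mytable") instead):

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.*;

public class ScanTimer {
    public static void main(String[] args) throws IOException {
        Configuration conf = HBaseConfiguration.create();
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("mytable"))) {
            Scan scan = new Scan();
            scan.setCaching(100); // client scanner cache: rows fetched per RPC
            long start = System.currentTimeMillis();
            long rows = 0;
            try (ResultScanner scanner = table.getScanner(scan)) {
                for (Result r : scanner) {
                    rows++; // count only; no per-cell work
                }
            }
            System.out.printf("%d rows in %d ms%n", rows, System.currentTimeMillis() - start);
        }
    }
}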


Context

  • 38 million rows / 1000 columns / single column family / 96 GB with GZ compression.
  • Cluster has 6 nodes (126 GB RAM, 24 cores) with 5 region servers.
  • Hortonworks Data Platform 2.2.0

Troubleshooting

Based on the HBase book (http://hbase.apache.org/book.html#performance), here is what I have already checked:

1) Hardware

  • IO (disk)
    • NMon shows the disks are never more than 80% busy, and most frequently between 0 and 20%
    • Top shows the HBase JVMs are not swapping (checked 2 of 5 RS)
  • IO (network): each node's active interface is on the same switch (all secondary passive interfaces are plugged into a different switch)

2) JVM

  • GC pauses OK (a few milliseconds of pause every minute or so)
  • Heap looks OK (not peaking too long near the limit)
  • CPU is surprisingly LOW: never more than 10%
  • Threads:
    • Active threads (10 "RpcServer.reader=N" + a few others) show no contention
    • Lots of parked threads doing nothing (60 "DefaultRpcServer.handler=n", approx. 15 others)
    • Huge list of IPC clients without any thread status

3) Data

  • Data was bulk loaded using Hive + completebulkload.
  • Number of regions:
    • 13 regions, meaning 2 to 3 large regions per RS, which is what is expected.
    • Scan performance remains unchanged after forcing a major compaction (a sketch for triggering one follows this list).
    • Region size is rather homogeneous: 4.5 GB (±0.5) for 11 regions, 2.5 GB for 2 regions.
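The major compaction itself can be forced from the HBase shell (major_compact 'mytable') or through the admin API; a minimal Java sketch, again assuming a hypothetical table name and the 1.x API (0.98 exposes a similar call on HBaseAdmin):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class ForceMajorCompaction {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Admin admin = conn.getAdmin()) {
            // Asynchronous: the request returns immediately while the
            // region servers rewrite all HFiles of the table.
            admin.majorCompact(TableName.valueOf("mytable"));
        }
    }
}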

4) HBase configuration

  • Most configuration remained unchanged.

    • hbase-env.sh only indicates ports for the JMX console
    • hbase-site.xml has a few settings for Phoenix
  • Some of the params that looked OK to me:

    • hbase.hregion.memstore.block.multiplier
    • hbase.hregion.memstore.flush.size: 134217728 bytes (128 MB)
    • Xmn ratio of Xmx: 0.2
    • Xmn max value: 512 MB
    • Xms: 6144m
    • hbase.regionserver.global.memstore.lowerLimit: 0.38
    • hbase.hstore.compactionThreshold: 3
    • hfile.block.cache.size: 0.4 (block cache size as % of heap)
    • Maximum HStoreFile size (hbase.hregion.max.filesize): 10 GB (10737418240 bytes)
    • Client scanner cache: 100 rows
    • ZooKeeper timeout: 30 s
    • Client max keyvalue size: 10 MB
    • hbase.regionserver.global.memstore.upperLimit: 0.40
    • hstore blocking storefiles: 10
    • hbase.hregion.memstore.mslab.enabled: enabled
    • hbase.hregion.majorcompaction.jitter: 0.5
  • Tried the following configuration changes, without any impact on performance (a sketch for verifying the loaded values follows this list):

    • hbase-env.sh: tried increasing HBASE_HEAPSIZE=6144 (since it defaults to 1000)
    • hbase-site.xml:
      • hbase.ipc.server.callqueue.read.ratio: 0.9
      • hbase.ipc.server.callqueue.scan.ratio: 0.9
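Since several of these server-side knobs only take effect if the right hbase-site.xml is actually on the classpath, one cheap sanity check is to dump the values a JVM really loaded. A minimal sketch (property names taken from the lists above; run it with the client's classpath, and check the region servers' configs separately for the server side):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class DumpEffectiveConf {
    public static void main(String[] args) {
        // Loads hbase-default.xml plus the hbase-site.xml found on the classpath
        Configuration conf = HBaseConfiguration.create();
        String[] keys = {
            "hbase.ipc.server.callqueue.read.ratio",
            "hbase.ipc.server.callqueue.scan.ratio",
            "hfile.block.cache.size",
            "hbase.regionserver.global.memstore.upperLimit"
        };
        for (String key : keys) {
            System.out.println(key + " = " + conf.get(key)); // null if unset anywhere
        }
    }
}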

5) Logs say nothing useful

Neither of the following prints anything:

cat hbase-hbase-master-cox.log | grep "2015-05-11.*ERROR"

cat hbase-hbase-regionserver-*.log | grep "2015-05-11.*ERROR"

Printing WARNs shows only unrelated warnings:

2015-05-11 17:11:10,544 WARN [B.DefaultRpcServer.handler=8,queue=2,port=60020] shortcircuit.ShortCircuitCache: ShortCircuitCache(0x2aca5fca): could not load 1074749724_BP-2077371184-184.10.17.65-1423758745093 due to InvalidToken exception.

2015-05-11 17:09:12,848 WARN [regionserver60020-smallCompactions-1430754386533] hbase.HBaseConfiguration: Config option "hbase.regionserver.lease.period" is deprecated. Instead, use "hbase.client.scanner.timeout.period"

Asked May 06 '15 by Martin




2 Answers

Got it: the key is to separate "hot" content from "cold" content into separate column families. Column families store their columns in separate HFiles, so we can use one column family for indexed (or frequently read) columns, and another column family (and thus file) for all the other columns.
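To illustrate, a minimal sketch with made-up names ("f1" for the hot family, "f2" for the cold one, table "mytable"): the table is created with two families, and a scan that only adds the hot family never opens the cold HFiles.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.*;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.util.Bytes;

public class HotColdFamilies {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Admin admin = conn.getAdmin()) {
            // One set of HFiles per family: "f1" holds the indexed/hot
            // columns, "f2" everything else.
            HTableDescriptor desc = new HTableDescriptor(TableName.valueOf("mytable"));
            desc.addFamily(new HColumnDescriptor("f1"));
            desc.addFamily(new HColumnDescriptor("f2"));
            admin.createTable(desc);

            // A scan restricted to the hot family reads far less data.
            try (Table table = conn.getTable(TableName.valueOf("mytable"))) {
                Scan scan = new Scan();
                scan.addFamily(Bytes.toBytes("f1"));
                try (ResultScanner rs = table.getScanner(scan)) {
                    for (Result r : rs) { /* hot columns only */ }
                }
            }
        }
    }
}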

First step: see that a smaller column family is faster to scan

We simply discard the cold content to build a single, smaller column family (1655 columns -> 7 columns).

Performance on medium-size table scans:

  • [37,876,602 rows, 1655 columns] scanning 1000 rows took 39.4750 seconds
  • [76,611,463 rows, 7 columns] scanning 1000 rows took 1.8620 seconds

Remarks:

  • the total number of rows can be ignored, since we only scan the first 1000 rows
  • there is some overhead with large rows, as scanning from the HBase shell prints the content to the console

Second step: generate a multi-family HTable

We do a bulk load by generating HFiles from Hive. Although the docs say we can't generate a multi-family table in one pass, we can generate the HFiles for each family separately:

create table mytable_f1 (UUID string, source_col1, source_col2)
...
TBLPROPERTIES('hfile.family.path' = 'tmp/mytable/f1');

create table mytable_f2 (UUID string, source_col3, source_col4)
...
TBLPROPERTIES('hfile.family.path' = 'tmp/mytable/f2');

Then simply call the import command as usual:

hadoop jar [hbase-server-jar] completebulkload /tmp/mytable mytable
Answered Sep 20 '22 by Martin


  • Turn off the block cache at scan time with scan.setCacheBlocks(false) (otherwise a full scan churns your heap's block cache).

  • Figure out the size of your records; if it is > 1 MB, increase hbase.client.scanner.timeout.period.

  • Set scan.setCaching(x) so that x * record size (what gets fetched in one round trip) is close to 1 MB.

  • A necessary check: make sure the regions of the table being scanned are evenly distributed across region servers.

(If you have done a bulk load, run a major compaction once.)
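Putting that advice together in a Java scan (the timeout and caching values below are illustrative assumptions; size setCaching from your measured record size):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.*;

public class TunedScan {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        // Raise the client scanner timeout if rows are large (assumed 2 min)
        conf.setInt("hbase.client.scanner.timeout.period", 120000);
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("mytable"))) {
            Scan scan = new Scan();
            scan.setCacheBlocks(false); // full scans would evict hot blocks otherwise
            // caching * average record size should land near 1 MB per RPC,
            // e.g. ~1 KB records -> caching of 1000
            scan.setCaching(1000);
            try (ResultScanner rs = table.getScanner(scan)) {
                for (Result r : rs) { /* process row */ }
            }
        }
    }
}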

Answered Sep 22 '22 by KrazyGautam