
HBase memstore manual flush

Tags: hadoop, hbase, hdfs

By design, HBase buffers writes in the memstore and, once the memstore reaches its size limit, flushes it to HDFS. This flush happens automatically behind the scenes.

In my case, I want to do an HDFS migration from one cluster to another, and I need to make sure nothing is left in memory before I bring down the HBase process on the source cluster. Is there any way to manually force a flush even though the memstore hasn't reached the limit?

==question added==

Further question: how do you know the flush has completed? Via metrics?

Shengjie asked Dec 04 '12 14:12





2 Answers

From the HBase shell, you can run flush 'tableName' to flush that table's memstore.
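For scripting, the same command can be piped into the shell instead of typed interactively. This is a minimal sketch, assuming the `hbase` launcher is on the PATH and that your version supports the `-n` (non-interactive) flag; `flush_table` is a hypothetical helper name, not part of HBase.

```shell
# Hypothetical helper: flush one table's memstore without opening an interactive shell.
# Usage: flush_table <table>
flush_table() {
  local table="$1"
  # `hbase shell -n` reads commands from stdin in non-interactive mode.
  echo "flush '${table}'" | hbase shell -n
}
```

Calling `flush_table tableName` sends exactly the command shown above, flush 'tableName', to the shell.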

But if you want to back up the /hbase/tableName folder via HDFS, the way to do that is:

  • disable the table: (from the shell: disable 'tableName')
  • copy files: hadoop fs -cp /hbase/tableName /hbase-backup/tableName
  • enable the table: (from the shell: enable 'tableName')

...or you can use the CopyTable or Export tools (http://hbase.apache.org/book/ops.backup.html)
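The disable/copy/enable steps can be sketched as one script. This assumes `hbase` and `hadoop` are on the PATH and that `hbase shell -n` is available; `backup_table` is a hypothetical helper. The source path is passed in rather than hard-coded because the on-disk layout differs between versions (newer releases keep tables under /hbase/data/<namespace>/ rather than directly under /hbase).

```shell
# Hypothetical helper: back up one table's HDFS directory while the table is disabled.
# Usage: backup_table <table> <source-dir> <backup-dir>
backup_table() {
  local table="$1" src="$2" dest="$3"

  # Take the table offline first; stop here if that fails.
  echo "disable '${table}'" | hbase shell -n || return 1

  # Copy the table directory inside HDFS; remember whether the copy succeeded.
  local rc=0
  hadoop fs -cp "${src}" "${dest}" || rc=$?

  # Re-enable the table even if the copy failed, then return the copy's status.
  echo "enable '${table}'" | hbase shell -n
  return "${rc}"
}
```

For the paths used in the steps above, the call would be `backup_table tableName /hbase/tableName /hbase-backup/tableName`.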

th30z answered Oct 03 '22 15:10



Since you are going to migrate the whole HBase database, you might want to do a batch disable:

disable_all '.*'

This forces HBase to flush the memstores and write everything into HFiles. Even after the disable, you will still see some WALs under /hbase/WALs, but don't worry: HBase has a WAL TTL that keeps the WALs around for a while even after their contents have been flushed into HFiles.

To answer your question "how to verify the flush is complete": go to HBase UI -> Regions -> Memory. You will see Memstore Size; make sure the values are all 0.
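If you would rather check from a script than from the UI, the RegionServer info server also serves its metrics as JSON. The following is only a sketch under several assumptions: that each RegionServer exposes a `/jmx` endpoint on its info port (16030 is assumed as the default here), that the server-level bean reports a `memStoreSize` value, and that `curl` is installed. Both function names are hypothetical, and the bean and metric names should be confirmed against your HBase version.

```shell
# Hypothetical helper: pull the first "memStoreSize" value out of JMX JSON read from stdin.
memstore_size_from_jmx() {
  sed -n 's/.*"memStoreSize"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Hypothetical helper: query one RegionServer (assumed info port 16030) and print its memstore size.
# Usage: regionserver_memstore_size <host> [port]
regionserver_memstore_size() {
  local host="$1" port="${2:-16030}"
  curl -s "http://${host}:${port}/jmx?qry=Hadoop:service=HBase,name=RegionServer,sub=Server" \
    | memstore_size_from_jmx
}
```

Running `regionserver_memstore_size <host>` against every RegionServer and seeing 0 each time corresponds to the all-zero Memstore Size column in the UI.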

linehrr answered Oct 03 '22 16:10
