Hadoop Balancer command WARN messages - threads quota is exceeded

Tags:

hadoop2

I am trying to run the Hadoop balancer command as follows:
hadoop balancer -threshold 1
But I am getting several WARN messages like this:

Failed to move blk_1073742036_1212 with size=134217728 from 192.168.30.4:50010 to 192.168.30.2:50010 through 192.168.30.4:50010: block move is failed: Not able to receive block 1073742036 from /192.168.10.3:53115 because threads quota is exceeded.

And at the end...
No block has been moved for 5 iterations. Exiting... Balancing took 4.092883333333333 minutes

I set ulimit values as follows:

core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 2065455
max locked memory (kbytes, -l) unlimited
max memory size (kbytes, -m) unlimited
open files (-n) 64000
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 10240
cpu time (seconds, -t) unlimited
max user processes (-u) 65535
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited

But I still get the same error.

Could someone please suggest how to fix this? I appreciate your help.

Sravan Kumar asked Aug 09 '14 20:08


1 Answer

This question was asked a long time ago, but I'm posting an answer for posterity's sake.

The Hadoop balancer had a bug that made it exit iterations prematurely, which made the balancer very slow. This was fixed in HDFS-6621 and officially released as part of Apache Hadoop 2.6.0. Since this is a bug in the Balancer itself, it is possible to run an updated version of the Balancer without upgrading your cluster.

Datanodes limit the number of threads used for balancing so that balancing does not eat up all the resources of the cluster/datanode. This limit is what causes the WARN message you're seeing. By default the number of threads is 5. This was not configurable prior to Apache Hadoop 2.5.0. HDFS-6595 added the property dfs.datanode.balance.max.concurrent.moves to let you control the number of threads used for balancing. Since this is a datanode-side property, using it requires upgrading your cluster.
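
As a rough sketch of how the dfs.datanode.balance.max.concurrent.moves property could be set on a cluster that already supports it: the fragment below would go in hdfs-site.xml on each datanode. The value 10 is only an illustrative assumption, not a recommendation, and I'm assuming the datanodes need a restart to pick it up.

```xml
<!-- hdfs-site.xml on each datanode (Apache Hadoop 2.5.0 or later) -->
<!-- The value 10 is a hypothetical example; the default is 5. -->
<property>
  <name>dfs.datanode.balance.max.concurrent.moves</name>
  <value>10</value>
</property>
```

A higher value lets each datanode take part in more block moves at once, at the cost of more disk and network load while the balancer runs.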

If you're using a distribution packaged by a vendor (e.g. Hortonworks, Cloudera, etc.), the fixes mentioned above may have been back-patched to an earlier version. Check your vendor's release notes to find out.

Pradeep Gollakota answered Sep 30 '22 18:09