 

HDFS log file is too huge

Tags:

hadoop

hdfs

After a lot of read and write operations against HDFS (I don't know which exact operation caused this problem), these two files, dncp_block_verification.log.curr and dncp_block_verification.log.prev, have each grown to more than 200,000,000,000 bytes (roughly 200 GB).

What kind of HDFS activity makes these files grow so fast?

From what I've read online, I could shut down HDFS and delete the logs, but that is not a good solution. How can I avoid this problem? Thank you very much.
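To confirm it really is these two files using the space, something like the following can be run on a DataNode (the /data/dfs/dn path is only an example; substitute whatever dfs.datanode.data.dir points to on your nodes):

    # Print the configured data directories for this DataNode
    hdfs getconf -confKey dfs.datanode.data.dir

    # Example path; locate the verification logs and report their sizes
    find /data/dfs/dn -name 'dncp_block_verification.log.*' -exec du -h {} +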

asked Aug 12 '14 by DunkOnly


2 Answers

The block scanner is what is causing the files to grow. Here's a link to an article explaining the behavior: http://aosabook.org/en/hdfs.html (Section 8.3.5).

The bug that causes this unbounded growth was fixed in HDFS 2.6.0.
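If upgrading to 2.6.0 is not immediately possible, it may help to confirm which release is in use and how often the block scanner is configured to run; a quick check, assuming the hdfs CLI is on the PATH:

    # Show the HDFS release; the unbounded log growth is fixed in 2.6.0 and later
    hdfs version

    # Show the block scanner period (dfs.datanode.scan.period.hours, default 504 hours = 3 weeks)
    hdfs getconf -confKey dfs.datanode.scan.period.hours

On releases before 2.6.0, the scan period can be changed in hdfs-site.xml to slow the growth, at the cost of detecting corrupt replicas more slowly; check your release's hdfs-default.xml for the exact semantics of that property.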

answered by B. Griffiths


I've run into a similar situation on my 20-datanode cluster, and I've seen several reports that this is a bug. I'm seeing the behavior with CDH 5.0.2, which ships HDFS 2.3.x.

One node out of the 20 used 100% of its available space because of the two dncp_block_verification logs. The other nodes have very typical log sizes, and I can't figure out why this one node had the issue.

I manually deleted the two files and that resolved my issue with no adverse behavior. Unfortunately, I don't know of a fix or automated solution to this problem.
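For anyone who needs to do the same cleanup, this is roughly the manual procedure; the data directory path is an example, and clusters managed by CDH or similar use their own service commands instead of hadoop-daemon.sh:

    # Stop only the affected DataNode; the rest of the cluster keeps serving data
    hadoop-daemon.sh stop datanode

    # Delete the oversized verification logs (example path; use your dfs.datanode.data.dir)
    find /data/dfs/dn -name 'dncp_block_verification.log.*' -delete

    # Restart the DataNode; the logs are recreated and start growing again from zero
    hadoop-daemon.sh start datanode

Stopping the DataNode first matters because the scanner keeps the current log open, and deleting an open file on Linux does not free the space until the process releases it.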

answered by Jaraxal