How to find a file from its block name in HDFS (Hadoop)

Question

What's the easiest way to find the file associated with a block in HDFS, given the block name/ID?

Abhijith · Accepted Answer

I'm not sure when this was introduced, but you can do this:

hdfs fsck -blockId <block_id>

hdfs fsck -blockId blk_1100790203
Connecting to namenode 
FSCK started by hdfs 

Block Id: blk_1100790203
Block belongs to: /tmp/1447685899336.txt
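
If you need the file name in a script rather than on screen, a minimal sketch is to run the same command and pick out the "Block belongs to:" line. The function names here are made up for illustration, and the parsing assumes the output looks like the sample above; other Hadoop versions may print it differently.

```python
import re
import subprocess
from typing import Optional

# Matches the "Block belongs to: <path>" line shown in the sample output above.
_BELONGS_RE = re.compile(r"^Block belongs to:[ \t]*(.+?)[ \t]*$", re.MULTILINE)


def parse_block_owner(fsck_output: str) -> Optional[str]:
    """Return the path from the 'Block belongs to:' line, or None if it is absent."""
    match = _BELONGS_RE.search(fsck_output)
    return match.group(1) if match else None


def find_file_for_block(block_id: str) -> Optional[str]:
    """Run `hdfs fsck -blockId <block_id>` and parse its output.

    Assumes the `hdfs` binary is on PATH and the caller may run fsck.
    """
    result = subprocess.run(
        ["hdfs", "fsck", "-blockId", block_id],
        capture_output=True,
        text=True,
        check=False,
    )
    return parse_block_owner(result.stdout)
```

Calling find_file_for_block("blk_1100790203") on the cluster from the example should give back /tmp/1447685899336.txt, provided the output format matches.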

secfree · Answer

Option 1: if you pass the block ID together with its generation stamp, the .meta suffix is needed:

$ hdfs fsck -blockId blk_1073823706_82968.meta

Option 2: use the block ID without the generation stamp:

$ hdfs fsck -blockId blk_1073823706
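
If the block names come from DataNode directories or logs, they may carry the generation stamp and a .meta suffix. A small sketch for reducing any of those forms to the bare ID used in Option 2 follows; bare_block_id is a hypothetical helper, and allowing a minus sign in the ID is my assumption about older clusters.

```python
import re

# blk_<id>, optionally followed by _<generationStamp> and/or a .meta suffix.
# The optional "-" is an assumption to cover negative IDs on older clusters.
_BLOCK_NAME_RE = re.compile(r"^(blk_-?\d+)(?:_\d+)?(?:\.meta)?$")


def bare_block_id(name: str) -> str:
    """Strip the generation stamp and .meta suffix, leaving blk_<id>."""
    match = _BLOCK_NAME_RE.match(name.strip())
    if match is None:
        raise ValueError(f"not a recognised block name: {name!r}")
    return match.group(1)
```

The result can then be passed straight to hdfs fsck -blockId.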

Chris White · Answer

The long and painful way, assuming you have read access to all the files (and execute for the directories):

hadoop fsck / -files -blocks | grep blk_520275863902385418_1002 -B 20

Then scan back up from your block match to the previous file name:

/hadoop/mapred/system/jobtracker.info 4 bytes, 1 block(s):  OK
0. blk_520275863902385418_1002 len=4 repl=1

In this case, blk_5202... is part of the /hadoop/mapred/system/jobtracker.info file.
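
The "scan back up" step can also be scripted instead of relying on grep -B 20, which misses the file name when a file has more than 20 blocks. This is only a sketch: file_for_block is a made-up name, and the two patterns assume file and block lines shaped like the sample above (with an optional "something:" prefix before blk_, which I believe some newer versions print).

```python
import re
from typing import Iterable, Optional

# File line, e.g. "/hadoop/mapred/system/jobtracker.info 4 bytes, 1 block(s):  OK"
_FILE_LINE_RE = re.compile(r"^(/.*) \d+ bytes, .*block\(s\)")
# Block line, e.g. "0. blk_520275863902385418_1002 len=4 repl=1"
_BLOCK_LINE_RE = re.compile(r"^\s*\d+\.\s+(?:\S+:)?(blk_-?\d+)")


def file_for_block(lines: Iterable[str], block_id: str) -> Optional[str]:
    """Return the file whose block list contains block_id, or None.

    `lines` is the output of `hadoop fsck / -files -blocks`; the generation
    stamp on block_id, if any, is ignored.
    """
    wanted = re.match(r"blk_-?\d+", block_id.strip())
    if wanted is None:
        raise ValueError(f"not a block ID: {block_id!r}")
    current_file = None
    for line in lines:
        file_match = _FILE_LINE_RE.match(line)
        if file_match:
            # Remember the most recent file line; its blocks follow it.
            current_file = file_match.group(1)
            continue
        block_match = _BLOCK_LINE_RE.match(line)
        if block_match and block_match.group(1) == wanted.group(0):
            return current_file
    return None
```

To use it, save the fsck output to a file (or pipe it in) and pass the open file object as lines.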

Programmatically, there isn't an interface to the name node that allows you to search by block ID, but you could look into the source for the secondary name node and see how it consolidates the edits, then experiment on the saved output from the secondary name node (rather than risking working on the live name node file).

Good luck!

Tags:

hadoop

hdfs

Inder Singh
