I have a 1 GB file that I've put on HDFS, so it would be broken into blocks and distributed to different nodes in the cluster.
Is there any command to identify the exact size of each block of the file on a particular node?
Thanks.
You should use the hdfs fsck command:
hdfs fsck /tmp/test.tar.gz -files -blocks
This command will print information about all the blocks the file consists of:
/tmp/test.tar.gz 151937000 bytes, 2 block(s): OK
0. BP-739546456-192.168.20.1-1455713910789:blk_1073742021_1197 len=134217728 Live_repl=3
1. BP-739546456-192.168.20.1-1455713910789:blk_1073742022_1198 len=17719272 Live_repl=3
As you can see, the len field in each row shows the actual size of that block in bytes.
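As a quick sanity check, the block lengths add up to the file size reported on the first line: the first block is exactly the common 128 MB default, and the second holds the remainder:
134217728 + 17719272 = 151937000 bytes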
hdfs fsck also has many other useful options, which are described on the official Hadoop documentation page.
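Since you asked about a particular node: if I remember correctly, adding the -locations flag makes fsck also print the DataNode addresses hosting each block replica, so you can see exactly where every block of the file lives:
hdfs fsck /tmp/test.tar.gz -files -blocks -locations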
You can try:
hdfs getconf -confKey dfs.blocksize
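Note that this prints the configured default block size in bytes (the dfs.blocksize setting), not the per-file breakdown. On a stock installation it will typically return:
134217728
i.e. 128 MB, which matches the length of the full block shown by fsck above.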