
Does Hadoop not show incomplete files?

Tags:

hadoop

hdfs

I'm using `hadoop fs -put` to copy a huge 100GB file into HDFS. My HDFS block size is 128MB, so the copy takes a long time. My question: while the copy is in progress, other users cannot see the file. Is this by design? How can another user access the partial file so that they, too, can monitor the copy progress?

asked Nov 25 '25 by gameover

1 Answer

The size is reported block by block. If your block size is 128MB, the file size will show as 128MB once the first block is written, then 256MB after the second block completes, and so on until the entire file is copied. So you can use the regular HDFS web UI or the command line (`hadoop fs -ls`) to monitor block-by-block copy progress. You can also read the portion that has already been copied with `hadoop fs -cat`, even while the copy is still in progress.
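As a sketch of the monitoring workflow described above (the path `/data/bigfile.dat` is a hypothetical example, and these commands assume a running HDFS cluster):

```shell
# Start the upload in one session (source path is illustrative)
hadoop fs -put /local/bigfile.dat /data/bigfile.dat

# In another session, watch the reported size grow in block-sized
# increments (128MB here) as each block is committed
hadoop fs -ls /data/bigfile.dat

# Read the already-committed portion while the copy is in progress,
# e.g. to peek at the first few lines
hadoop fs -cat /data/bigfile.dat | head
```

Note that `fs -ls` only reflects completed blocks, so the size it reports lags the actual bytes transferred by up to one block.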

answered Nov 28 '25 by Hari Menon