
How to decompress Hadoop reduce output files ending in .snappy?




Our Hadoop cluster uses Snappy as the default codec, so the reduce output files of a Hadoop job are named like part-r-00000.snappy. JSnappy fails to decompress such a file because it requires the file to start with SNZ, but the reduce output file starts with some 0 bytes instead.

How can I decompress the file?

DeepNightTwo asked Nov 06 '13 05:11


1 Answer

Use "hadoop fs -text" to read the file and redirect the output to a text file. For example:

hadoop fs -text part-r-00001.snappy > /tmp/mydatafile.txt
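If a job leaves several part files behind, the same command can be wrapped in a small helper. This is only a sketch: decompress_parts is a hypothetical function name, the paths in the usage line are placeholders, and it assumes the hadoop command is on the PATH and that the cluster's Snappy codec is available to it. The leading 0 bytes in the question are most likely the length prefixes of Hadoop's own block framing rather than the SNZ stream header, which would explain why "hadoop fs -text" can read the file while JSnappy cannot.

```shell
#!/bin/sh
# Sketch: dump every Snappy-compressed reducer output in one directory
# into a single local text file.
# decompress_parts is a hypothetical helper name, not part of Hadoop.
decompress_parts() {
    src_dir="$1"    # directory holding the part-r-*.snappy files
    out_file="$2"   # local destination file
    # The glob is quoted so that Hadoop, not the local shell, expands it.
    hadoop fs -text "${src_dir}/part-r-*.snappy" > "${out_file}"
}

# Usage (placeholder paths):
# decompress_parts /user/me/job-output /tmp/mydatafile.txt
```

The output of all matching part files is concatenated in the order Hadoop lists them, so this suits line-oriented text output best.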

arviarya answered Sep 24 '22 09:09
