View gzipped file content in hadoop

Tags:

hadoop

How can I decompress and view a few lines of a compressed file in HDFS? The command below displays the last few lines of the compressed data:

hadoop fs -tail /myfolder/part-r-00024.gz

Is there a way to use the -text command and pipe its output to the tail command? I tried this, but it doesn't work:

hadoop fs -text /myfolder/part-r-00024.gz > hadoop fs -tail /myfolder/

339

asked Aug 12 '15 14:08

nobody

3 Answers

The following will show you the specified number of lines without decompressing the whole file:

hadoop fs -cat /hdfs_location/part-00000.gz | zcat | head -n 20

The following will page the file, also without first decompressing the whole of it:

hadoop fs -cat /hdfs_location/part-00000.gz | zmore
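
The early-exit behaviour of the first pipeline can be checked locally without a cluster. This is only a sketch: plain `cat` on a temporary file stands in for `hadoop fs -cat`, `gzip -dc` is used as the equivalent of `zcat`, and the generated test data is an assumption for illustration.

```shell
#!/bin/sh
# Local stand-in for the HDFS pipeline: `cat` plays the role of `hadoop fs -cat`.
# The temp file and its contents are made up for this demo; no cluster is touched.
tmp=$(mktemp)
seq 1 100000 | gzip -c > "$tmp"

# gzip -dc behaves like zcat. head exits after 5 lines, which closes the pipe,
# so the upstream commands stop instead of decompressing everything.
first_lines=$(cat "$tmp" | gzip -dc | head -n 5)
printf '%s\n' "$first_lines"

rm -f "$tmp"
```

The same idea does not help with `tail`, since the end of a gzip stream can only be reached by decompressing everything before it.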

135

answered Nov 29 '22 23:11

leo9r

Try the following; it should work as long as your file isn't too big (since the whole thing will be decompressed):

hadoop fs -text /myfolder/part-r-00024.gz | tail
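
If you do this often, the command above could be wrapped in a small shell function. This is a sketch only: `hdfs_tail` is a hypothetical helper name, not a Hadoop command, and it assumes `hadoop` is on your `PATH`.

```shell
# Hypothetical helper: print the last N lines (default 10) of a compressed HDFS file.
# It only wraps `hadoop fs -text`, so the whole file is still decompressed.
hdfs_tail() {
    hdfs_tail_path=$1
    hdfs_tail_lines=${2:-10}
    hadoop fs -text "$hdfs_tail_path" | tail -n "$hdfs_tail_lines"
}
```

For example, `hdfs_tail /myfolder/part-r-00024.gz 20` would print the last 20 lines.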

answered Nov 30 '22 00:11

mattinbits

I ended up writing a Pig script:

A = LOAD '/myfolder/part-r-00024.gz' USING PigStorage('\t');
B = LIMIT A 10;
DUMP B;

answered Nov 30 '22 01:11

nobody
