Probably a noob question, but is there a way to read the contents of a file in HDFS other than copying it to the local filesystem and reading it through Unix tools?
So right now what I am doing is:
bin/hadoop dfs -copyToLocal hdfs/path local/path
nano local/path
I am wondering if I can open a file directly on HDFS rather than copying it locally and then opening it.
You can use the Hadoop filesystem shell to read any file; it provides a cat command for printing a file's contents.
Usage: hadoop fs -ls [-d] [-h] [-R] [-t] [-S] [-r] [-u] <args>
Options:
  -d  Directories are listed as plain files.
  -h  Format file sizes in a human-readable fashion (e.g. 64.0m instead of 67108864).
  -R  Recursively list subdirectories encountered.
  -t  Sort output by modification time (most recent first).
I believe hadoop fs -cat <file>
should do the job.
If the file is huge (which is usually the case with HDFS), running 'cat' will flood your terminal with the entire contents of the file. Instead, pipe the output and read only a few lines.
To get the first 10 lines of the file, hadoop fs -cat 'file path' | head -10
To get the last 5 lines of the file, hadoop fs -cat 'file path' | tail -5
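The same piping pattern works with any command that writes to stdout. Here is a minimal sketch using a plain local file in place of `hadoop fs -cat`, so it runs without a cluster; the file name `sample.txt` is just an illustration:

```shell
# Create a small sample file standing in for HDFS content.
printf 'line %d\n' 1 2 3 4 5 6 7 8 9 10 11 12 > sample.txt

# First 10 lines (on a cluster, swap "cat sample.txt"
# for "hadoop fs -cat /hdfs/path").
cat sample.txt | head -10

# Last 5 lines.
cat sample.txt | tail -5
```

Note that Hadoop also ships a built-in `hadoop fs -tail <file>`, which displays the last kilobyte of a file and avoids streaming the whole file through the pipe.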