How to replace text in a file stored on HDFS

Question

I have file.txt on a UNIX file system. Its content is:

{abc}]}
{pqr}]}

I want to convert this file.txt into:

[
{abc}]},
{pqr}]}
]

I can do this with the following shell script:

sed -i 's/}]}/}]},/g' file.txt
sed -i '1i [' file.txt
sed -i '$ s/}]},/}]}]/g' file.txt

My question is: what if this file were stored on HDFS under /test?

If I use: sed -i 's/}]}/}]},/g' /test/file.txt

it looks for /test on the local UNIX file system and reports that the file does not exist.

If I use: sed -i 's/}]}/}]},/g' | hadoop fs -cat /test/file.txt

it reports "sed: no input files" and then prints the content of file.txt, as the cat command would.

If I use: hadoop fs -cat /test/file.txt | sed -i 's/}]}/}]},/g'

it reports "sed: no input files" followed by "cat: Unable to write to output stream".

So how can I replace a string with another string in a file stored on HDFS?

PradeepKumbhar · Accepted Answer

With sed and hdfs commands:

hdfs dfs -cat /test/file.txt | sed 's/$/,/g; $s/,$/\n]/; 1i [' | hadoop fs -put -f - /test/file.txt

where:

hdfs dfs -cat /test/file.txt writes the content of the HDFS file to standard output

s/$/,/g; adds a comma at the end of each line

$s/,$/\n]/; removes the comma from the last line and appends a newline followed by a closing bracket

1i [ inserts an opening bracket before the first line

hadoop fs -put -f - /test/file.txt reads from standard input and overwrites the original file in HDFS
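The sed part of the pipeline can be checked locally before touching HDFS. The sketch below is only an illustration: it assumes GNU sed, the function name wrap_lines is made up for this example, and printf stands in for hdfs dfs -cat. Note that sed runs without -i here, because it reads from a pipe rather than from a named file.

```shell
#!/bin/sh
# wrap_lines is a hypothetical helper name, not part of the original answer.
# It reads lines on stdin and writes them to stdout as a bracketed,
# comma-separated list. Assumes GNU sed: the one-line form of `1i` and
# `\n` in the replacement text are GNU extensions.
wrap_lines() {
    sed 's/$/,/; $s/,$/\n]/; 1i ['
}

# Local dry run: printf stands in for `hdfs dfs -cat /test/file.txt`.
printf '%s\n' '{abc}]}' '{pqr}]}' | wrap_lines

# Possible use against HDFS (left commented out so the sketch runs without
# a cluster). Writing to a temporary path first avoids reading and
# overwriting the same file in one pipeline; /test/file.txt.tmp is an
# assumed name for illustration.
# hdfs dfs -cat /test/file.txt | wrap_lines | hdfs dfs -put -f - /test/file.txt.tmp
# hdfs dfs -rm /test/file.txt
# hdfs dfs -mv /test/file.txt.tmp /test/file.txt
```

If the dry run prints the expected four lines, the same filter can be placed between hdfs dfs -cat and hadoop fs -put as in the one-liner above.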
