UNIX: Replace Newline w/ Colon, Preserving Newline Before EOF

I have a text file ("INPUT.txt") of the format:

A<LF>
B<LF>
C<LF>
D<LF>
X<LF>
Y<LF>
Z<LF>
<EOF>

which I need to reformat to:

A:B:C:D:X:Y:Z<LF>
<EOF>

I know you can do this with 'sed'. There are a billion Google hits for doing this with 'sed'. But I'm trying to emphasize readability, simplicity, and using the correct tool for the correct job. 'sed' is a line editor that consumes and hides newlines. Probably not the right tool for this job!

I think the correct tool for this job would be 'tr'. I can replace all the newlines with colons with the command:

cat INPUT.txt | tr '\n' ':'

There's 99% of my work done. I have a problem now, though. By replacing all the newlines with colons, I not only get an extraneous colon at the end of the sequence, but I also lose the newline at the end of the input. It looks like this:

A:B:C:D:X:Y:Z:<EOF>
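A quick way to confirm this symptom is to look at the raw bytes. The following is only a sketch on made-up three-line sample data; the variable names are illustrative and not part of the original question.

```shell
#!/bin/sh
# Reproduce the symptom on a three-line sample (sample data is made up).
converted=$(printf 'A\nB\nC\n' | tr '\n' ':')

# Dump the bytes: every newline has become ':' and nothing follows the last one.
printf '%s' "$converted" | od -c

# Count the bytes: 3 letters + 3 colons, with no newline at the end.
count=$(printf '%s' "$converted" | wc -c | tr -d ' ')
printf '%s\n' "$count"   # prints 6
```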

Now, I need to remove the colon from the end of the input. However, if I attempt to pass this processed input through 'sed' to remove the final colon (which would now, I think, be a proper use of 'sed'), I find myself with a second problem. The input is no longer terminated by a newline at all! 'sed' fails outright, for all commands, because it never finds the end of the first line of input!

Appending a newline to the end of some input seems like a very, very common task. I was just sorely tempted to write a program to do it in C (which would take about eight lines of code), but I can't imagine there isn't already a very simple way to do this with the standard utilities available on a Linux system.

Maarx asked May 26 '10 18:05



3 Answers

This should do the job (cat and echo are unnecessary; note that \n in the replacement text relies on GNU sed):

tr '\n' ':' < INPUT.txt | sed 's/:$/\n/'

Using only sed:

sed -n ':a; $ ! {N;ba}; s/\n/:/g;p' INPUT.txt

Bash without any externals:

string=($(<INPUT.txt))
string=${string[@]/%/:}
string=${string//: /:}
string=${string%*:}

Using a loop in sh:

string=''
colon=''
while IFS= read -r line
do
    string=$string$colon$line
    colon=':'
done < INPUT.txt
printf '%s\n' "$string"

Using AWK:

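awk '{a=a colon $0; colon=":"} END {print a}' INPUT.txt

Or (passing the line as an argument, so a % in the input is not treated as a format specifier):

awk '{printf "%s%s", colon, $0; colon=":"} END {printf "\n"}' INPUT.txt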

Edit:

Here's another way in pure Bash:

string=($(<INPUT.txt))
saveIFS=$IFS
IFS=':'
newstring="${string[*]}"
IFS=$saveIFS

Edit 2:

Here's yet another way which does use echo (the negative count in head -c -1 is a GNU extension):

echo "$(tr '\n' ':' < INPUT.txt | head -c -1)"
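The same idea can probably be expressed without GNU head by trimming the trailing colon with shell parameter expansion. This is a minimal sketch, not the answer's own method: join_lines is a hypothetical helper name, and the sample file is created only for illustration.

```shell
#!/bin/sh
# join_lines FILE: hypothetical helper, not part of the original answer.
join_lines() {
    # tr has already turned the final newline into ':', so command
    # substitution has no trailing newline to strip and the colon survives.
    joined=$(tr '\n' ':' < "$1")
    # ${joined%:} drops at most one trailing colon; printf restores the newline.
    printf '%s\n' "${joined%:}"
}

# Sample input matching the layout in the question (temporary file).
tmp=$(mktemp)
printf '%s\n' A B C D X Y Z > "$tmp"
result=$(join_lines "$tmp")
printf '%s\n' "$result"   # prints A:B:C:D:X:Y:Z
rm -f "$tmp"
```

Because ${joined%:} removes a colon only if one is present, the sketch should also cope with input that lacks a final newline.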
Dennis Williamson answered Nov 16 '22 02:11


Old question, but paste can do this in one step:

paste -sd: INPUT.txt
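For anyone who wants to check the result, here is a small sketch that runs the command on a sample file and dumps the bytes. The temporary file and variable names are assumptions made for the demonstration.

```shell
#!/bin/sh
# Build a sample file matching the question's layout (temporary path).
tmp=$(mktemp)
printf '%s\n' A B C D X Y Z > "$tmp"

# -s joins all lines of the file into one line; -d: uses ':' as the separator.
joined=$(paste -sd: "$tmp")
printf '%s\n' "$joined"   # prints A:B:C:D:X:Y:Z

# Dump the raw bytes to inspect how the output is terminated.
paste -sd: "$tmp" | od -c

rm -f "$tmp"
```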
huon answered Nov 16 '22 00:11


Here's yet another solution (assumes a character set where ':' is octal 72, e.g. ASCII):

perl -l72 -pe '$\="\n" if eof' INPUT.txt
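The octal value can be checked from the shell. This is a small sketch using the standard printf utility; the variable names are illustrative only.

```shell
#!/bin/sh
# Octal 072 is ':' in ASCII, which is why -l72 makes perl end each line with ':'.
sep=$(printf '\072')
printf '%s\n' "$sep"    # prints :

# The reverse lookup: a leading quote makes printf use the character's code.
code=$(printf '%o' "':")
printf '%s\n' "$code"   # prints 72
```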
William Pursell answered Nov 16 '22 00:11