I have a file that consists of a repeating sequence of three lines, which I want to merge together. In other words, I'd like to replace every \n except each third one
with a space. E.g. I'd like to transform the input
href="file:///home/adam/MyDocs/some_file.pdf"
visited="2013-06-02T20:40:06Z"
exec="'firefox %u'"
href="file:///home/adam/Desktop/FreeRDP-WebConnect-1.0.0.167-Setup.exe"
visited="2013-06-03T08:50:37Z"
exec="'firefox %u'"
href="file:///home/adam/Friends/contact.txt"
visited="2013-06-03T16:01:16Z"
exec="'gedit %u'"
href="file:///home/adam/Pictures/Screenshot%20from%202013-06-03%2019:10:36.png"
visited="2013-06-03T17:10:36Z"
exec="'eog %u'"
into
href="file:///home/adam/MyDocs/some_file.pdf" visited="2013-06-02T20:40:06Z" exec="'firefox %u'"
href="file:///home/adam/Desktop/FreeRDP-WebConnect-1.0.0.167-Setup.exe" visited="2013-06-03T08:50:37Z" exec="'firefox %u'"
href="file:///home/adam/Friends/contact.txt" visited="2013-06-03T16:01:16Z" exec="'gedit %u'"
href="file:///home/adam/Pictures/Screenshot%20from%202013-06-03%2019:10:36.png" visited="2013-06-03T17:10:36Z" exec="'eog %u'"
Unfortunately the file is rather long, so I'd prefer not to load the whole file into memory, nor to write the result back into the file; I just want to print the concatenated lines to standard output so I can pipe them further.
I know that sed could probably do this, but after giving it an honest try I am still at square one; the learning curve is just too steep for me. :-(
I did a rough benchmark and found that the sed variant is noticeably faster (about 1.4 s versus 1.9 s of wall-clock time):
time awk '{ printf "%s", $0; if (NR % 3 == 0) print ""; else printf " " }' out.txt >/dev/null
real 0m1.893s
user 0m1.860s
sys 0m0.028s
and
time cat out.txt | sed 'N;N;s/\n/ /g' > /dev/null
real 0m1.360s
user 0m1.264s
sys 0m0.236s
It is interesting: why does sed require more kernel time than awk?
out.txt is 200 MB in size; the processor is an Intel(R) Core(TM) i7-3610QM CPU @ 2.30GHz, running Linux Mint 14 with kernel 3.8.13-030813-generic.
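One possible factor in the kernel-time difference, offered as a guess rather than a measured result: the sed run was fed through cat, and in bash the time keyword covers the whole pipeline, so cat's reads and the pipe writes are counted too. The sketch below uses a small temporary file as a hypothetical stand-in for out.txt and only shows that the piped and direct forms produce the same output, so a like-for-like benchmark can drop the cat.

```shell
# Hypothetical small stand-in for out.txt, so the sketch runs anywhere.
sample=$(mktemp)
printf 'a\nb\nc\nd\ne\nf\n' > "$sample"

# Original form: an extra cat process copies the data through a pipe.
piped=$(cat "$sample" | sed 'N;N;s/\n/ /g')

# Direct form: sed opens the file itself, as awk did in the benchmark.
direct=$(sed 'N;N;s/\n/ /g' "$sample")

[ "$piped" = "$direct" ] && echo "same output"
rm -f "$sample"
```

Prefixing the direct form with time on the real out.txt would show whether the extra sys time really comes from the pipe.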
I need this in my effort to parse recently-used.xbel, the list of recently opened files in Cinnamon.
If you came here for this specific problem, this line should help you:
xpath -q -e "//bookmark[*]/@href | //bookmark[*]/@visited | //bookmark[*]/info/metadata/bookmark:applications[1]/bookmark:application[1]/@exec" recently-used.xbel | sed 's/href="\(.*\)"/"\1"/;N;s/visited="\(.*\)"/\1/;N;s/exec="\(.*\)"/"\1"/;s/\n/ /g' | xargs -n3 whatever-script-you-write
sed operates by performing the following cycle on each line of input: first, sed reads one line from the input stream, removes any trailing newline, and places it in the pattern space. Then commands are executed; […]. The N command adds a newline to the pattern space, then appends the next line of input to the pattern space.
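To see what the pattern space looks like after two N commands, here is a minimal sketch using sed's l command, which prints the pattern space in an unambiguous form. The three-line input is made up for illustration.

```shell
# -n suppresses the automatic print, so only the output of l appears.
# After N;N the pattern space holds three input lines joined by newlines.
printf 'a\nb\nc\n' | sed -n 'N;N;l'
# prints: a\nb\nc$
```

The embedded newlines shown as \n are what s/\n/ /g then turns into spaces.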
How about this:
sed 'N;N;s/\n/ /g' file
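A caveat worth hedging: if the number of lines is not a multiple of three, N runs out of input on the last group. GNU sed then prints the pattern space and exits without running the substitution, and other implementations may discard the leftover lines. A sketch of a more defensive variant, assuming the leftover lines should still be joined, guards each N with $! so it is skipped on the last line. The function name join3_sed is hypothetical.

```shell
# Join every three lines with spaces; a trailing group of one or two
# lines is still joined and printed instead of being left as-is or lost.
join3_sed() {
  sed '$!N;$!N;s/\n/ /g' "$@"
}

# Five input lines: one full group plus two leftover lines.
printf 'a\nb\nc\nd\ne\n' | join3_sed
# prints:
#   a b c
#   d e
```

For input whose length is an exact multiple of three, this behaves the same as the shorter command above.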
You can use awk to do this pretty easily:
awk '{ printf "%s", $0; if (NR % 3 == 0) print ""; else printf " " }' file
The basic idea is "print each line followed by a space, unless it's every third line, in which case print a newline instead".
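One detail of that approach: when the input does not end on a multiple of three, the last output line ends with a space and no newline. A sketch of a variant that prints the separator before each line instead, so the output always ends cleanly, might look like this. The function name join3_awk is hypothetical.

```shell
# Start a new output line before lines 4, 7, 10, ...; put a space before
# the second and third line of each group; finish with one newline.
join3_awk() {
  awk 'NR % 3 == 1 && NR > 1 { print "" }
       { printf "%s%s", (NR % 3 == 1 ? "" : " "), $0 }
       END { if (NR) print "" }' "$@"
}

# Four input lines: one full group plus one leftover line.
printf 'a\nb\nc\nd\n' | join3_awk
# prints:
#   a b c
#   d
```

Like the original one-liner, it processes the input as a stream and never holds more than one line in memory.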