So for example,

A paragraph's newlines would be removed let's say
it contained only single
newlines.

Then the things I would want to skip out:

However.

Our previous pair of newlines wouldn't.
It’s not a sed solution — although you can always run any sed through s2p of course — but a very easy solution using perl is:
% perl -i.orig -ne 'print unless /^$/' file1 file2 file3
That has the advantage of being extensible to any whitespace on the otherwise blank lines, like spaces and tabs:
% perl -i.orig -ne 'print unless /^\s*$/' file1 file2 file3
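For anyone who would rather see that filter spelled out, here is a rough Python sketch of the same idea. It is only an illustration, not a replacement for the one-liner: the function name drop_blank_lines is made up, and it works on a string in memory instead of editing files in place.

```python
import io


def drop_blank_lines(text: str) -> str:
    """Return text without lines that are empty or whitespace-only.

    Hypothetical helper mirroring `print unless /^\\s*$/`.
    """
    kept = []
    # io.StringIO yields lines split on "\n" only, terminator included.
    for line in io.StringIO(text):
        # str.strip() removes all Unicode whitespace (U+00A0 included),
        # so a line that strips down to "" counts as blank.
        if line.strip():
            kept.append(line)
    return "".join(kept)


if __name__ == "__main__":
    sample = "first\n\n  \t\nsecond\n"
    print(drop_blank_lines(sample), end="")
```

Because Python strings are already Unicode, the sketch treats a line holding only a no-break space as blank without any extra switch.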
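In the event that you have files with various line endings, like CR or CRLF, you could also do this, assuming you are running perl 5.10 or better:
% perl -0777 -i.orig -pe 's/\R+/\n/g' file1 file2 file3
which will normalize all sequences of one or more Unicode line separators into single newlines. Note the -p and the /g: with -n nothing would be printed back into the file, and without /g only the first run of line separators would be replaced.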
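To make it clearer what gets collapsed there, here is a small Python sketch of the same normalization. Python's re module has no \R, so the pattern below is my approximation of it (CRLF as a unit, or any single vertical-whitespace character); the names LINEBREAK_RUN and squeeze_linebreaks are invented for the example.

```python
import re

# Approximation of Perl's \R+ (an assumption, since Python's re has no \R):
# CRLF taken as one unit, or LF, VT, FF, CR, NEL, LINE SEPARATOR,
# PARAGRAPH SEPARATOR, repeated one or more times.
LINEBREAK_RUN = re.compile(r"(?:\r\n|[\n\x0b\x0c\r\x85\u2028\u2029])+")


def squeeze_linebreaks(text: str) -> str:
    """Replace every run of line breaks with a single "\\n"."""
    return LINEBREAK_RUN.sub("\n", text)


if __name__ == "__main__":
    mixed = "a\r\n\r\nb\rc\n\n\nd\n"
    print(repr(squeeze_linebreaks(mixed)))
```

As with the perl version, collapsing runs of line breaks also removes empty lines as a side effect.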
If you have UTF-8 files that might have (for example) U+00A0 NO-BREAK SPACE in them on otherwise empty lines, you can handle them by telling perl that they are UTF-8 using the -CSD command-line switch:
% perl -CSD -i.orig -ne 'print unless /^\s*$/' file1 file2 file3
I’m really unclear what you mean by removing a paragraph. I think you just mean joining up lines in a paragraph.
If so, and what you want to do is squeeze the newlines out of each paragraph, then you want to do this:
% perl -i.orig -00 -ple 's/\s*\n\s*/ /g' file1 file2 file3
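It may not look like it works, but it does: -00 reads the input a paragraph at a time, and -l strips each paragraph's trailing newlines before the substitution runs, so only the newlines inside a paragraph are turned into spaces. Try it.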
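If the switches make that one-liner hard to follow, the same steps can be sketched in Python. This is an illustration under a few assumptions of mine: paragraphs are separated by truly empty lines, the output ends with a single newline, and the function name join_paragraph_lines is hypothetical.

```python
import re


def join_paragraph_lines(text: str) -> str:
    """Join the lines inside each paragraph, keeping paragraph breaks."""
    # Paragraphs are separated by one or more empty lines, i.e. runs of
    # two or more "\n"; leading and trailing newlines are dropped first.
    paragraphs = re.split(r"\n{2,}", text.strip("\n"))
    # Inside a paragraph, a newline plus any surrounding whitespace
    # becomes one space (the same substitution as s/\s*\n\s*/ /g).
    joined = [re.sub(r"\s*\n\s*", " ", p) for p in paragraphs]
    # Put one blank line back between paragraphs.
    return "\n\n".join(joined) + "\n"


if __name__ == "__main__":
    sample = (
        "A paragraph's newlines would be removed let's say\n"
        "it contained only single\n"
        "newlines.\n"
        "\n"
        "However.\n"
        "\n"
        "Our previous pair of newlines wouldn't.\n"
    )
    print(join_paragraph_lines(sample), end="")
```

Splitting first and substituting second is what keeps the blank lines between paragraphs from being swallowed by the \s* in the pattern.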
Here's a sed solution.
$ sed -n -e '1{${p;b};h;b};/^$/!{H;$!b};x;s/\(.\)\n/\1 /g;p' 5751270.txt
A paragraph would be removed let's say it contained only single newlines.

However.

Our previous pair of newlines wouldn't.