Remove blank lines in a file using sed

France  211 55  Europe

Japan   144 120 Asia
Germany 96  61  Europe

England 94  56  Europe




Taiwan  55  144 Asia
North Korea 44  2134    Asia

The above is my data file.

There are empty lines in it.

There are no spaces or tabs in those empty lines.

I want to remove all empty lines in the data.

I searched and found that "Delete empty lines using SED" gives the perfect answer.

Before that, I wrote two sed commands myself:

sed -r 's/\n\n+/\n/g' cou.data
sed 's/\n\n\n*/\n/g' cou.data

I also tried awk with gsub, which was not successful either.

awk '{ gsub(/\n\n*/, "\n"); print }' cou.data

None of them work; the output is unchanged.

Where did I go wrong with my sed code?

Sleeping On a Giant's Shoulder asked Aug 10 '18 09:08


2 Answers

Use the following sed command to delete all blank lines.

sed '/./!d' cou.data

Explanation:

  • /./ matches any line that contains at least one character, so an empty line does not match.
  • ! negates the selector, i.e. it makes the command apply to lines which do not match the selector, which in this case is the empty line(s).
  • d deletes the selected line(s).
  • cou.data is the path to the input file.
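
As a quick check of the command above, here is a minimal sketch that runs it on a small stand-in for cou.data. The sample text and the variable names (sample, cleaned, cleaned_alt) are made up for illustration. It also shows /^$/d, which selects the same empty lines directly instead of negating /./.

```shell
# Hypothetical sample input standing in for cou.data: blank lines between records.
sample=$(printf 'France\t211\n\nJapan\t144\n\n\n\nTaiwan\t55\n')

# Same command as above, reading from standard input instead of a file.
cleaned=$(printf '%s\n' "$sample" | sed '/./!d')

# Equivalent form: delete lines where the start (^) is immediately followed by the end ($).
cleaned_alt=$(printf '%s\n' "$sample" | sed '/^$/d')

printf '%s\n' "$cleaned"
```

Both variables end up holding the three records with no blank lines between them. To keep the result, redirect the output of sed to a new file.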

Where did you go wrong?

The following excerpt from How sed Works states:

sed operates by performing the following cycle on each line of input: first, sed reads one line from the input stream, removes any trailing newline, and places it in the pattern space. Then commands are executed; each command can have an address associated to it: addresses are a kind of condition code, and a command is only executed if the condition is verified before the command is to be executed.

When the end of the script is reached, unless the -n option is in use, the contents of pattern space are printed out to the output stream, adding back the trailing newline if it was removed. Then the next cycle starts for the next input line.

The parts pertinent to why your sed examples are not working are that sed reads one line at a time and that it removes the trailing newline before any command runs. Given your examples:

  • They seem to disregard that sed reads one line at a time.
  • The trailing newlines which you're trying to match (\n\n+ and \n\n\n* in your first and second example respectively) don't actually exist in the pattern space. They've been removed by the time your regexp is applied and are only reinstated when the end of the script is reached.

RobC answered Sep 18 '22 21:09
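
The point can be seen with a small experiment. This is only a sketch: the three-line input and the variable names are hypothetical. It also covers the awk attempt from the question, which fails for the same reason, since awk splits its input into newline-delimited records by default.

```shell
# Hypothetical three-line input: "a", an empty line, "b".
input=$(printf 'a\n\nb')

# Each sed cycle sees one line without its newline, so \n never matches
# and the text passes through unchanged.
unchanged=$(printf '%s\n' "$input" | sed 's/\n\n*/X/g')

# awk records do not contain the newline either, so gsub(/\n\n*/, ...) finds nothing.
# Selecting the non-empty records works instead.
fixed=$(printf '%s\n' "$input" | awk 'length($0) > 0')

printf '%s\n' "$fixed"
```

Here unchanged is identical to input, while fixed holds only the lines "a" and "b".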


RobC's answer is great if your lines are terminated by a newline (linefeed, \n) only, because sed separates lines that way. If your lines are terminated by \r\n (CRLF), which you may have your reasons for doing even on a Unix system, you will not get a match: from sed's perspective the line isn't empty, because the \r (CR) counts as a character. Instead you can try:

sed '/^\r$/d' filename

Explanation:

  • ^ matches the start of the line
  • \r matches the carriage return
  • $ matches the end of the line
  • d deletes the selected line(s).
  • filename is the path to the input file.
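
If a file mixes both kinds of blank line, one pattern can cover them. The following is a minimal sketch, not a definitive solution: the function name strip_blank_lines is hypothetical, and because the \r escape may not be understood by every sed implementation, it builds a literal carriage return with printf instead.

```shell
# Hypothetical helper: delete lines that are empty or hold only a carriage return.
# Reads the files given as arguments, or standard input when there are none.
strip_blank_lines() {
    cr=$(printf '\r')    # a literal CR, for seds that lack the \r escape
    # \{0,1\} makes the CR optional, so both "" and "\r" lines are deleted.
    sed "/^${cr}\{0,1\}\$/d" "$@"
}

# Usage: CRLF-terminated records with one CR-only line and one truly empty line.
printf 'a\r\n\r\nb\r\n\nc\n' | strip_blank_lines
```

The remaining lines keep whatever line ending they had; only the blank ones are dropped.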

datak answered Sep 20 '22 21:09