sed not finding '0A' control character

Tags: linux, sed
The file contains a carriage return / line feed sequence, hex '0D 0A'. Hexedit clearly shows the two bytes:

20 0D 0A 31

and

sed  -n '/\x0D/p' ./test.txt

clearly identifies the lines. However,

sed  -n '/\x0A/p' ./test.txt

does not find any lines.

To make this even more interesting, after using sed to remove the '0D', it still did not find the '0A' in the remaining bytes:

20 0A 31

How can sed be used to remove 0D0A after a specific character string? The file has extraneous line breaks after specific text, which creates two lines where there should be one. The objective is to recreate one line from two.

dan sawyer asked Feb 07 '23 20:02


2 Answers

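sed reads the input line by line; lines are separated by \x0a, and sed strips that separator before it applies any pattern, so \x0a never appears in the text being matched.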
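A quick way to see this, as a minimal sketch assuming GNU sed (the \x escapes are a GNU extension) and a made-up two-line CRLF sample built with printf:

```shell
# Sample bytes: 20 0D 0A 31 0D 0A (two CRLF-terminated lines, as in the question).

# The CR is still part of each line, so both lines match.
cr_hits=$(printf ' \r\n1\r\n' | sed -n '/\x0D/p' | wc -l | tr -d ' ')

# The LF was stripped when each line was read, so nothing matches.
lf_hits=$(printf ' \r\n1\r\n' | sed -n '/\x0A/p' | wc -l | tr -d ' ')

printf 'CR matches: %s, LF matches: %s\n' "$cr_hits" "$lf_hits"
```

On the sample above this should report two CR matches and zero LF matches, which is the behaviour described in the question.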

You can use Perl instead; -0777 makes it read the whole file at once rather than line by line:

perl -0777 -pe 's/\x0d\x0a//'
choroba answered Feb 09 '23 11:02


In sed, unlike, say, Python, the newline character \n is treated as a line separator and is not part of any line. If you want to modify the newline characters, read the whole file in at once and then do your substitutions:

sed 'H;1h;$!d;x; s/something\n/somethingelse/g'

How it works

sed has both a pattern space (where newly read lines go) and a hold space. These commands append each newly read line to the hold space until we reach the last line. Then the whole file is transferred to the pattern space, and you can run whatever commands you wish on it. In detail:

  • H - Append current line to hold space
  • 1h - If this is the first line, overwrite the hold space with it
  • $!d - If this is not the last line, delete the pattern space and start over with the next line
  • x - Exchange hold and pattern space to put whole file in pattern space
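To check what ends up in the pattern space after these four commands, one option is sed's l command, which prints the buffer unambiguously. A small sketch with a made-up three-line input:

```shell
# l shows embedded newlines as \n and marks the end of the buffer with $.
# -n suppresses the normal output so only the l dump is printed.
slurped=$(printf 'a\nb\nc\n' | sed -n 'H;1h;$!d;x;l')

printf '%s\n' "$slurped"   # prints: a\nb\nc$
```

The single output line shows that all three input lines now sit in one buffer, joined by newline characters that a substitution can match.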

Example

Let's consider this sample file:

$ cat file
Try to av-
oid word
splits.

Now, let's merge any line that ends with a hyphen with the line that follows:

$ sed 'H;1h;$!d;x; s/-\n//g' file
Try to avoid word
splits.

Or, if you prefer the hex form:

$ sed 'H;1h;$!d;x; s/-\x0a//g' file
Try to avoid word
splits.
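Applied to the CRLF file from the question, the same idea might look like the sketch below. It assumes GNU sed for the \x escapes, and the marker "foo" and the sample data are made up for illustration:

```shell
# Sample with CRLF line endings; the first record is wrongly split after "foo".
# After slurping, both 0D and 0A are in the pattern space and can be matched.
fixed=$(printf 'foo\r\nbar\r\nbaz\r\n' |
  sed 'H;1h;$!d;x; s/foo\x0d\x0a/foo/g')

# Dump the bytes to confirm that only the break after "foo" was removed.
printf '%s\n' "$fixed" | od -c
```

GNU sed also has a -z option that uses NUL instead of newline as the line separator, which has a similar slurping effect on files that contain no NUL bytes.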
John1024 answered Feb 09 '23 11:02
