Identify and remove specific hidden characters from text file

Question

I have a text file that contains several hidden characters. Using cat -v I can see that they include the following:

^M

^[[A

There are also characters at the end of the line. I would like to be able to display these as well somehow.

Then I would like to selectively remove these hidden characters with cut or sed. How would I go about accomplishing this?

I've tried dos2unix, but it didn't remove any of the ^M characters. I've also tried sed s/^M//g, entering the ^M by pressing Ctrl+V and then M.

Raw data

Output from cat -v on the raw data, also available at: http://pastebin.com/Vk2i81JC

^MCopying non-tried blocks... Pass 1 (forwards)^M^[[A^[[A^[[Arescued:         0 B,  errsize:       0 B,  current rate:        0 B/s
   ipos:         0 B,   errors:       0,    average rate:        0 B/s
   opos:         0 B, run time:       1 s,  successful read:       1 s ago
^MFinished

Output wanted

Also available at: http://pastebin.com/wfDnrELm

rescued:         0 B,  errsize:       0 B,  current rate:        0 B/s
   ipos:         0 B,   errors:       0,    average rate:        0 B/s
   opos:         0 B, run time:       1 s,  successful read:       1 s ago
Finished

Ram · Accepted Answer

Try the tr command below, which translates or deletes characters. With the -c and -d options it deletes every character other than the ones specified in octal within the quotes.

Octal \12 is newline (LF), octal \11 is TAB (^I), and octal \40-\176 is the range of printable ASCII characters, from space through ~.

For a complete reference of octal values refer to this page: https://courses.engr.illinois.edu/ece390/books/labmanual/ascii-code-table.html

tr -cd '\11\12\40-\176' < org.txt > new.txt

The file new.txt will contain the text of org.txt with those characters removed.
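As a rough illustration of what this filter does and does not catch, the sketch below runs it on a made-up two-line sample (the sample text is an assumption, not the asker's file). Only the ESC byte of a cursor-up sequence falls outside the allowed range, so the printable "[A" that follows it is kept.

```shell
# Made-up sample: a leading carriage return on line 1 and a
# cursor-up escape sequence (ESC [ A) at the start of line 2.
sample=$(printf '\rFinished\n\033[Arescued: 0 B\n')

# Keep only TAB, LF and printable ASCII, exactly as in the command above.
filtered=$(printf '%s\n' "$sample" | tr -cd '\11\12\40-\176')

# The CR and the ESC byte are gone, but "[A" is printable and survives.
printf '%s\n' "$filtered"
```

If the leftover "[A" matters, the whole escape sequence has to be removed before tr runs, for example with sed.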

To remove the text between the ^M characters as well as the unnecessary control characters, use the command below. Each ^M stands for a literal carriage return, typed as Ctrl+V followed by Ctrl+M:

sed "s/^M.*^M//g" org.txt | tr -cd '\11\12\40-\176' > new.txt
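A variant that avoids typing literal control characters might look like the sketch below. The function name strip_hidden is hypothetical, and the second sed expression is an addition not in the answer above: it assumes the escape sequences are simple ANSI CSI sequences such as ESC [ A, and removes them whole so that no "[A" is left behind.

```shell
# Hypothetical helper: reads text on stdin, writes cleaned text to stdout.
strip_hidden() {
    # Build the control characters with printf instead of typing them.
    cr=$(printf '\r')
    esc=$(printf '\033')

    # 1. Drop everything from the first CR to the last CR on a line.
    # 2. Drop CSI sequences: ESC [ optional digits/semicolons, one letter.
    # 3. Delete any remaining byte that is not TAB, LF or printable ASCII.
    sed -e "s/${cr}.*${cr}//g" \
        -e "s/${esc}\[[0-9;]*[A-Za-z]//g" |
        tr -cd '\11\12\40-\176'
}
```

It would be used as strip_hidden < org.txt > new.txt. To check the result byte by byte, including what sits at the end of each line, od -c new.txt prints every character with control characters shown as escapes.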

Tags: bash, unix, sed
