Remove non-ASCII characters from CSV

Tags: sed, awk

I want to remove all the non-ASCII characters from a file in place.

I found one solution using tr, but I think it requires writing the output back to the file afterwards.
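For reference, the tr approach mentioned here can be sketched roughly as follows. The sample file name and its contents are hypothetical, and the temporary-file-then-move step is an assumption about what "writing back" would look like, since tr itself only reads standard input and writes standard output.

```shell
# Hypothetical sample file; name and contents are for illustration only.
workdir=$(mktemp -d)
file="$workdir/sample.csv"
printf 'id,name\n1,caf\303\251\n2,na\303\257ve\n' > "$file"

# tr cannot edit in place, so filter into a temporary file and move it
# over the original. -c complements the set (everything that is NOT
# in the ASCII range 0-127) and -d deletes those bytes.
# LC_ALL=C makes tr treat the input as raw bytes.
LC_ALL=C tr -cd '\000-\177' < "$file" > "$file.tmp" && mv "$file.tmp" "$file"

cat "$file"
```

This is not truly in place: it needs enough free disk space for a second copy of the file while the filter runs.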

I need to do it in place with relatively good performance.

Any suggestions?

Sujit asked Jul 26 '10 18:07



2 Answers

A Perl one-liner will do it:

perl -i.bak -pe 's/[^[:ascii:]]//g' <your file>

-i.bak tells Perl to edit the file in place and to save a backup of the original with the extension .bak.

ssegvic answered Oct 15 '22 19:10


# -i edits the file in place
sed -i 's/[\d128-\d255]//g' FILENAME
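A hedged sketch of the same command on a throwaway file. It assumes GNU sed, since the \dNNN decimal escape and -i without a suffix argument are GNU extensions. The LC_ALL=C prefix is an addition not in the original command: in a UTF-8 locale the byte range may be rejected or may not match as intended, whereas the C locale makes sed work byte by byte. The file name and contents are hypothetical.

```shell
# Hypothetical sample file; name and contents are for illustration only.
workdir=$(mktemp -d)
file="$workdir/sample.csv"
printf 'id,name\n1,caf\303\251\n' > "$file"

# Delete every byte with a value from 128 to 255 (GNU sed assumed).
# LC_ALL=C forces byte-wise matching regardless of the user's locale.
LC_ALL=C sed -i 's/[\d128-\d255]//g' "$file"

cat "$file"
```

Unlike the Perl variant with -i.bak, this keeps no backup; GNU sed accepts -i.bak as well if one is wanted.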
Ivan answered Oct 15 '22 18:10