I want to remove non-ascii chars from some file. I have already tried these many regexs.
sed -e 's/[\d00-\d128]//g' # not working
cat /bin/mkdir | sed -e 's/[\x00-\x7F]//g' >/tmp/aa
but this file contains some non-ascii chars.
[root@asssdsada ~]$ hexdump /tmp/aa |more
00 01 02 03 04 05 06 07 - 08 09 0A 0B 0C 0D 0E 0F 0123456789ABCDEF
00000000 45 4C 46 B0 F0 73 38 C0 - C0 BC BC FF FF 61 61 61 ELF..s8......aaa
00000010 A0 A0 50 E5 74 64 50 57 - 50 57 50 57 D4 D4 51 E5 ..P.tdPWPWPW..Q.
00000020 74 64 6C 69 62 36 34 6C - 64 6C 69 6E 75 78 78 38 tdlib64ldlinuxx8
00000030 36 36 34 73 6F 32 47 4E - 55 42 C8 C0 80 70 69 42 664so2GNUB...piB
00000040 44 47 BA E3 92 43 45 D5 - EC 46 E4 DE D8 71 58 B9 DG...CE..F...qX.
00000050 8D F1 EA D3 EF 4B 86 FC - A9 DA 79 ED 63 B5 51 92 .....K....y.c.Q.
00000060 BA 6C FC D1 69 78 30 ED - 74 F1 73 95 CC 85 D2 46 .l..ix0.t.s....F
00000070 A5 B4 6C 67 DA 4A E9 9A - 4B 58 77 A4 37 80 C0 4F ..lg.J..KXw.7..O
00000080 F3 E9 B2 77 65 97 74 F9 - A2 C0 F2 CC 4A 9C 58 A1 ...we.t.....J.X.
Bring up the command palette with CTRL+SHIFT+P (Windows, Linux) or CMD+SHIFT+P on Mac. Type Remove Non ASCII Chars until you see the commands. Select Remove non Ascii characters (File) for removing in the entire file, or Remove non Ascii characters (Select) for removing only in the selected text.
ASCII character set is captured using regex [A-Za-z0-9]. You can use this regex in your query as shown below, to find non-ASCII characters. mysql> SELECT * FROM data WHERE full_name NOT REGEXP '[A-Za-z0-9]'; You can also customize regex to include certain characters.
The Unix/Linux “tr” command The tr command is one of the true “filters” in the Unix operating system, because it works only on input/output streams, and not on files. The -d flag is what tells tr to delete the characters you supply.
encode() to remove non-ASCII characters. Call str. encode(encoding, errors) with encoding as "ASCII" and errors as "ignore" to return str without "ASCII" characters.
This doesn't seem to work with sed
. Perhaps tr
will do?
tr -d '\200-\377'
Or with the complement:
tr -cd '\000-\177'
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With