I need to remove all multibyte characters from a file. I don't know which ones are present, so I need to cover the whole range.
I can find them using grep like so: grep -P "[\x80-\xFF]" 'myfile'
I'm trying to do a similar thing with sed, but delete them instead.
Cheers
Deleting a line using sed: to delete a line, use the sed "d" command. Note that you have to specify which line to delete (for example, sed '3d' file deletes line 3). Otherwise, sed will delete all the lines.
A multibyte character is encoded as a sequence of one or more bytes, and each byte sequence represents a single character in the extended character set. Multibyte characters are used in character sets such as Kanji. Wide characters are multilingual character codes of a fixed width: wchar_t is 16 bits wide on Windows and typically 32 bits wide on Linux. The type for character constants is char; for wide characters, the type is wchar_t.
Give this a try:
LANG=C sed 's/[\x80-\xFF]//g' filename
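A quick way to check the command above on a throwaway file. This is only a sketch, assuming GNU sed (the \xHH escapes are a GNU extension); the sample text and variable names are made up for illustration. It uses LC_ALL=C rather than LANG=C because LC_ALL takes precedence if it happens to be set in your environment. The tr line is an alternative I've swapped in, not part of the answer above, which deletes the same byte range using octal escapes:

```shell
# Build a sample file containing one two-byte UTF-8 character (é = 0xC3 0xA9).
sample=$(mktemp)
printf 'caf\303\251 ok\n' > "$sample"

# Delete every byte in the range 0x80-0xFF.
# LC_ALL=C makes sed treat the input as raw bytes (GNU sed assumed).
cleaned_sed=$(LC_ALL=C sed 's/[\x80-\xFF]//g' "$sample")

# Alternative: tr with octal escapes (\200-\377 is the same range as 0x80-0xFF).
cleaned_tr=$(LC_ALL=C tr -d '\200-\377' < "$sample")

echo "$cleaned_sed"   # prints: caf ok
echo "$cleaned_tr"    # prints: caf ok

rm -f "$sample"
```

With GNU sed you can add -i to edit the file in place, but it's worth running without -i first to check the output.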