I want to remove Unicode characters in some range, e.g.:
echo "abcＡＢＣ123" | sed 's/[\uff21-\uff3b]//g'
I expect "abc123", but get:
sed: -e expression #1, char 20: Invalid range end
Or, if I use:
echo "abcＡＢＣ123" | sed 's/[Ａ-Ｚ]//g'
I get:
sed: -e expression #1, char 14: Invalid collation character
Unicode support in sed is not well defined. You may be better off using command-line perl:
echo "abcＡＢＣ123" | perl -CS -pe 's/[\x{FF21}-\x{FF3B}]+//g'
abc123
It is important to use the -CS flag here so that perl treats standard input, output, and error as UTF-8.