I want to remove Unicode characters in a given range, e.g.:
echo "abcＡＢＣ123" | sed 's/[\uff21-\uff3b]//g'
I expect "abc123", but get:
sed: -e expression #1, char 20: Invalid range end
Or, using the characters directly:
echo "abcＡＢＣ123" | sed 's/[Ａ-Ｚ]//g'
I get:
sed: -e expression #1, char 14: Invalid collation character
Unicode support in sed is not well defined. You may be better off using command line perl:
echo "abcＡＢＣ123" | perl -CS -pe 's/[\x{FF21}-\x{FF3B}]+//g'
abc123
The -CS flag is important here: it makes perl treat STDIN, STDOUT, and STDERR as UTF-8, so the input is decoded into characters before the match and the output is encoded again afterwards.