How can a text string be turned into UTF-8 encoded bytes using Bash and/or common Linux command line utilities? For example, in Python one would do:
"Six of one, ½ dozen of the other".encode('utf-8')
b'Six of one, \xc2\xbd dozen of the other'
Is there a way to do this in pure Bash:
STR="Six of one, ½ dozen of the other"
<utility_or_bash_command_here> --encoding='utf-8' $STR
'Six of one, \xc2\xbd dozen of the other'
Python to the rescue!
alias encode='python3 -c "from sys import stdin; print(stdin.read().encode(\"utf-8\"))"'
root@kali-linux:~# echo "½ " | encode
b'\xc2\xbd \n'
Also, you can remove b''
with some sed/awk thingy if you want.
Perl to the rescue!
echo "$STR" | perl -pe 's/([^x\0-\x7f])/"\\x" . sprintf "%x", ord $1/ge'
The /e
modifier allows to include code into the replacement part of the s///
substitution, which in this case converts ord to hex via sprintf.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With