I have searched many times online and I have not been able to find a way to convert my binary string variable, X
X = "1000100100010110001101000001101010110011001010100"
into a UTF-8 string value.
I have found that some people are using methods such as
b'message'.decode('utf-8')
however, this method has not worked for me, as 'b' is said to be nonexistent, and I am not sure how to replace the 'message' with a variable. Not only, but I have not been able to comprehend how this method works. Is there a better alternative?
So how could I convert a binary string into a text string?
EDIT: I also do not mind ASCII decoding
CLARIFICATION: Here is specifically what I would like to happen.
def binaryToText(z):
# Some code to convert binary to text
return (something here);
X="0110100001101001"
print binaryToText(X)
This would then yield the string...
hi
Method #1: The binary data is divided into sets of 7 bits because this set of binary as input, returns the corresponding decimal value which is ASCII code of the character of a string. This ASCII code is then converted to string using chr() function.
To convert an integer to string in Python, use the str() function. This function takes any data type and converts it into a string, including integers. Use the syntax print(str(INT)) to return the int as a str , or string.
Python split() method is used to split the string into chunks, and it accepts one argument called separator. A separator can be any character or a symbol. If no separators are defined, then it will split the given string and whitespace will be used by default.
It looks like you are trying to decode ASCII characters from a binary string representation (bit string) of each character.
You can take each block of eight characters (a byte), convert that to an integer, and then convert that to a character with chr()
:
>>> X = "0110100001101001"
>>> print(chr(int(X[:8], 2)))
h
>>> print(chr(int(X[8:], 2)))
i
Assuming that the values encoded in the string are ASCII this will give you the characters. You can generalise it like this:
def decode_binary_string(s):
return ''.join(chr(int(s[i*8:i*8+8],2)) for i in range(len(s)//8))
>>> decode_binary_string(X)
hi
If you want to keep it in the original encoding you don't need to decode any further. Usually you would convert the incoming string into a Python unicode string and that can be done like this (Python 2):
def decode_binary_string(s, encoding='UTF-8'):
byte_string = ''.join(chr(int(s[i*8:i*8+8],2)) for i in range(len(s)//8))
return byte_string.decode(encoding)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With