Here are my attempts with error messages. What am I doing wrong?
string.decode("ascii", "ignore")
UnicodeEncodeError: 'ascii' codec can't encode character u'\xa0' in position 37: ordinal not in range(128)
string.encode('utf-8', "ignore")
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 37: ordinal not in range(128)
Design an algorithm to encode a list of strings to a string. The encoded string is then sent over the network and is decoded back to the original list of strings.
In Java, when we deal with String sometimes it is required to encode a string in a specific character set. Encoding is a way to convert data from one format to another. String objects use UTF-16 encoding. The problem with UTF-16 is that it cannot be modified.
Encoding is essentially a writing process, whereas decoding is a reading process. Encoding breaks a spoken word down into parts that are written or spelled out, while decoding breaks a written word into parts that are verbally spoken.
In computers, encoding is the process of putting a sequence of characters (letters, numbers, punctuation, and certain symbols) into a specialized format for efficient transmission or storage. Decoding is the opposite process -- the conversion of an encoded format back into the original sequence of characters.
You can't decode a unicode
, and you can't encode a str
. Try doing it the other way around.
Guessing at all the things omitted from the original question, but, assuming Python 2.x the key is to read the error messages carefully: in particular where you call 'encode' but the message says 'decode' and vice versa, but also the types of the values included in the messages.
In the first example string
is of type unicode
and you attempted to decode it which is an operation converting a byte string to unicode. Python helpfully attempted to convert the unicode value to str
using the default 'ascii' encoding but since your string contained a non-ascii character you got the error which says that Python was unable to encode a unicode value. Here's an example which shows the type of the input string:
>>> u"\xa0".decode("ascii", "ignore") Traceback (most recent call last): File "<pyshell#7>", line 1, in <module> u"\xa0".decode("ascii", "ignore") UnicodeEncodeError: 'ascii' codec can't encode character u'\xa0' in position 0: ordinal not in range(128)
In the second case you do the reverse attempting to encode a byte string. Encoding is an operation that converts unicode to a byte string so Python helpfully attempts to convert your byte string to unicode first and, since you didn't give it an ascii string the default ascii decoder fails:
>>> "\xc2".encode("ascii", "ignore") Traceback (most recent call last): File "<pyshell#6>", line 1, in <module> "\xc2".encode("ascii", "ignore") UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 0: ordinal not in range(128)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With