
What is the difference between encode/decode?

People also ask

What is the difference between encoding and decoding with example?

Encoding means creating a message that you want to communicate to another person. Decoding, on the other hand, is what the listener or audience of the encoded message does: interpreting the meaning of the message. For example, a breakfast cereal company encodes an advertising message to convey to you that you should buy its product.

What is the difference between encode and decode in reading?

Encoding is the process of hearing a sound and being able to write a symbol to represent that sound. Decoding is the opposite: it involves seeing a written symbol and being able to say what sound it represents.

What is coding decoding and encoding?

Encoding is the process of putting a sequence of characters, such as letters, numbers and other special characters, into a specialized format for efficient transmission. Decoding is the process of converting an encoded format back into the original sequence of characters.

What is the difference between encode and decode Python?

To represent a unicode string as a string of bytes is known as encoding. To convert a string of bytes to a unicode string is known as decoding.
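
For instance, a minimal Python 2 round trip (ö is U+00F6; its UTF-8 encoding is the two bytes C3 B6):

>>> u'ö'.encode('utf-8')        # unicode -> bytes
'\xc3\xb6'
>>> '\xc3\xb6'.decode('utf-8')  # bytes -> unicode
u'\xf6'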


The decode method of unicode strings really doesn't have any applications at all (unless you have some non-text data in a unicode string for some reason -- see below). It is mainly there for historical reasons, I think. In Python 3 it is completely gone.

unicode().decode() will first perform an implicit encoding of the string using the default (ascii) codec. Verify this like so:

>>> s = u'ö'
>>> s.decode()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\xf6' in position 0:
ordinal not in range(128)

>>> s.encode('ascii')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\xf6' in position 0:
ordinal not in range(128)

The error messages are exactly the same.

For str().encode() it's the other way around -- it first attempts an implicit decoding of the string with the default (ascii) codec:

>>> s = 'ö'
>>> s.decode('utf-8')
u'\xf6'
>>> s.encode()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 0:
ordinal not in range(128)

Used like this, str().encode() is also superfluous.

But there is another application of the latter method that is useful: there are encodings that have nothing to do with character sets, and thus can be applied to 8-bit strings in a meaningful way:

>>> s.encode('zip')
'x\x9c;\xbc\r\x00\x02>\x01z'

You are right, though: the ambiguous usage of "encoding" for both these applications is... awkward. Again, with separate byte and string types in Python 3, this is no longer an issue.
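
For comparison, a rough sketch of how the same distinction looks in a Python 3 session, where str.encode() only accepts character encodings and binary transforms such as zlib go through the codecs module:

>>> import codecs
>>> data = 'ö'.encode('utf-8')                  # str -> bytes (a character encoding)
>>> packed = codecs.encode(data, 'zlib_codec')  # bytes -> bytes (a binary transform)
>>> codecs.decode(packed, 'zlib_codec') == data
True
>>> data.encode('zlib_codec')                   # bytes has no .encode() at all
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'bytes' object has no attribute 'encode'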


To represent a unicode string as a string of bytes is known as encoding. Use u'...'.encode(encoding).

Example:

    >>> u'æøå'.encode('utf8')
    '\xc3\xa6\xc3\xb8\xc3\xa5'
    >>> u'æøå'.encode('latin1')
    '\xe6\xf8\xe5'
    >>> u'æøå'.encode('ascii')
    UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-2:
    ordinal not in range(128)

You typically encode a unicode string whenever you need to use it for IO, for instance transfer it over the network, or save it to a disk file.
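
A minimal sketch of that pattern, using a made-up file path ('/tmp/example.txt' is only an illustration):

    >>> data = u'æøå'
    >>> f = open('/tmp/example.txt', 'wb')  # open in binary mode
    >>> f.write(data.encode('utf-8'))       # encode to bytes before writing to disk
    >>> f.close()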

To convert a string of bytes to a unicode string is known as decoding. Use unicode('...', encoding) or '...'.decode(encoding).

Example:

    >>> u'æøå'
    u'\xe6\xf8\xe5' # the interpreter prints the unicode object like so
    >>> unicode('\xc3\xa6\xc3\xb8\xc3\xa5', 'utf-8')
    u'\xe6\xf8\xe5'
    >>> '\xc3\xa6\xc3\xb8\xc3\xa5'.decode('utf-8')
    u'\xe6\xf8\xe5'

You typically decode a string of bytes whenever you receive string data from the network or from a disk file.
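
Continuing the hypothetical file from above, the reading side would look roughly like this:

    >>> raw = open('/tmp/example.txt', 'rb').read()  # raw bytes as stored on disk
    >>> raw
    '\xc3\xa6\xc3\xb8\xc3\xa5'
    >>> raw.decode('utf-8')                          # decode back to a unicode string
    u'\xe6\xf8\xe5'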

Note that unicode handling changed in Python 3: there, str is always a Unicode string, bytes is the byte string type, and the unicode() builtin is gone, so the transcripts above are Python 2 only.
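
A rough sketch of the same round trip in a Python 3 session (note that str has no decode method there):

    >>> s = 'æøå'               # Python 3: str is already a Unicode string
    >>> b = s.encode('utf-8')   # str -> bytes
    >>> b
    b'\xc3\xa6\xc3\xb8\xc3\xa5'
    >>> b.decode('utf-8')       # bytes -> str
    'æøå'
    >>> s.decode('utf-8')       # str has no decode method in Python 3
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    AttributeError: 'str' object has no attribute 'decode'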

Some good links:

  • The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)
  • Unicode HOWTO

anUnicode.encode('encoding') results in a string object and can be called on a unicode object

aString.decode('encoding') results in a unicode object and can be called on a string encoded in the given encoding.
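
A quick way to see the resulting types in a Python 2 session:

>>> type(u'ö'.encode('utf-8'))
<type 'str'>
>>> type('\xc3\xb6'.decode('utf-8'))
<type 'unicode'>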


Some more explanations:

You can create a unicode object which doesn't have any encoding set. The way it is stored by Python in memory is none of your concern. You can search it, split it, and call any string-manipulating function you like.

But there comes a time when you'd like to print your unicode object to the console or write it into some text file. So you have to encode it (for example in UTF-8): you call encode('utf-8') and you get a byte string, with escapes like '\xc3\xb6' inside, which can be written out as plain bytes.

Then, again, you'd like to do the opposite: read a string encoded in UTF-8 and treat it as Unicode, so that a multi-byte sequence such as '\xc3\xb6' becomes one character rather than two bytes. You decode the string (with the selected encoding) and get a brand new object of the unicode type.

Just as a side note: you can also pick special codecs like 'zip', 'base64' or 'rot13'; some of them convert from byte string to byte string rather than to unicode, but I believe the most common case is the one that involves UTF-8/UTF-16 and unicode strings.


mybytestring.encode(somecodec) is meaningful for these values of somecodec:

  • base64
  • bz2
  • zlib
  • hex
  • quopri
  • rot13
  • string_escape
  • uu
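
For a couple of entries from that list, a short Python 2 transcript (the byte string here is just an arbitrary example):

>>> 'binary\x00data'.encode('base64')
'YmluYXJ5AGRhdGE=\n'
>>> 'binary\x00data'.encode('hex')
'62696e6172790064617461'
>>> '62696e6172790064617461'.decode('hex')
'binary\x00data'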

I am not sure what decoding an already decoded unicode text is good for. Trying that with any encoding seems to always try to encode with the system's default encoding first.