How to convert a string to utf-8 in Python

In Python 2

>>> plain_string = "Hi!"
>>> unicode_string = u"Hi!"
>>> type(plain_string), type(unicode_string)
(<type 'str'>, <type 'unicode'>)

^ This is the difference between a byte string (plain_string) and a unicode string (unicode_string).

>>> s = "Hello!"
>>> u = unicode(s, "utf-8")

^ This decodes the byte string s as utf-8 and returns a unicode string.

In Python 3

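All strings (str) are unicode, and the unicode function no longer exists. To get utf-8 bytes, call .encode("utf-8") on the string; to turn bytes back into a string, call .decode("utf-8") on them. See the answer from @Noumenon.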
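
As a rough illustration of the Python 3 behaviour, here is a minimal sketch of a round trip between str and utf-8 bytes. The variable names are made up for this example:

```python
# Python 3: str holds Unicode text, bytes holds encoded data.
text = "Hi! ボールト"

# Encode the text to utf-8 bytes.
utf8_bytes = text.encode("utf-8")

# Decode the bytes back to text.
round_trip = utf8_bytes.decode("utf-8")

print(type(text), type(utf8_bytes))
print(round_trip == text)
```

Note that the byte string is longer than the text, because each of the Japanese characters takes three bytes in utf-8.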

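If the methods above fail with a UnicodeDecodeError, you can also tell Python to skip the bytes it can't decode as utf-8. This silently drops data, and decode has to be called on a byte string (str in Python 2, bytes in Python 3):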

stringnamehere.decode('utf-8', 'ignore')
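
To see what the error handler actually does, here is a small Python 3 sketch comparing 'ignore' with 'replace' and with the default strict mode. The sample bytes are invented for the example; 0xff is used because it is never valid in utf-8:

```python
# Bytes that are not valid utf-8: 0xff can never appear in utf-8.
raw = b"caf\xc3\xa9 \xff ok"

ignored = raw.decode("utf-8", "ignore")    # drops the bad byte
replaced = raw.decode("utf-8", "replace")  # substitutes U+FFFD for it

try:
    raw.decode("utf-8")  # default is errors="strict"
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

print(repr(ignored))
print(repr(replaced))
print(strict_failed)
```

'replace' is often the safer choice when you want to notice later that something was lost, since 'ignore' leaves no trace of the dropped bytes.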

This might be a bit overkill, but when I work with ascii and unicode in the same files, repeating decode can be a pain. This is what I use (Python 2):

def make_unicode(inp):
    # Python 2 only: decode byte strings, pass unicode objects through unchanged.
    if not isinstance(inp, unicode):
        inp = inp.decode('utf-8')
    return inp
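
The same idea can be carried over to Python 3, where the check is against bytes rather than unicode. This is only a sketch; the name make_text and the encoding parameter are hypothetical and not part of the original answer:

```python
def make_text(inp, encoding="utf-8"):
    """Return inp as str, decoding it first if it is bytes or bytearray."""
    if isinstance(inp, (bytes, bytearray)):
        return inp.decode(encoding)
    # Already text (or some other type): hand it back unchanged.
    return inp


print(make_text(b"Hi!"))
print(make_text("Hi!"))
```

Both calls return the same str, so the rest of the code can work with text without caring where it came from.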

Adding the following line as the first or second line of your .py file:

# -*- coding: utf-8 -*-

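tells Python that the source file is utf-8, so you can write non-ascii string literals directly in your script, like this (in Python 2 this is still a byte string unless you add the u prefix):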

utfstr = "ボールト"
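
In Python 3 the source encoding defaults to utf-8, so the declaration is not needed there and the literal is already a str. A short sketch, assuming Python 3, of how the text and its utf-8 encoding differ in length:

```python
utfstr = "ボールト"

# Encode the text to get the utf-8 bytes.
encoded = utfstr.encode("utf-8")

print(len(utfstr))   # number of characters (code points)
print(len(encoded))  # number of utf-8 bytes
```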
