Usage of unicode() and encode() functions in Python

Tags: python, string, sqlite, encoding, unicode

I have a problem with the encoding of the path variable when inserting it into an SQLite database. I tried to solve it with the encode("utf-8") function, which didn't help. Then I used the unicode() function, which gives me the type unicode.

    print type(path)                  # <type 'unicode'>
    path = path.replace("one", "two") # <type 'str'>
    path = path.encode("utf-8")       # <type 'str'> strange
    path = unicode(path)              # <type 'unicode'>

In the end I got the unicode type, but I still get the same error that appeared when the path variable was of type str:

sqlite3.ProgrammingError: You must not use 8-bit bytestrings unless you use a text_factory that can interpret 8-bit bytestrings (like text_factory = str). It is highly recommended that you instead just switch your application to Unicode strings.

Could you help me solve this error and explain the correct usage of the encode("utf-8") and unicode() functions? I often struggle with them.

EDIT:

This execute() statement raised the error:

cur.execute("update docs set path = :fullFilePath where path = :path", locals())

I forgot to change the encoding of the fullFilePath variable, which suffers from the same problem, but I'm quite confused now. Should I use only unicode(), only encode("utf-8"), or both?

I can't use

fullFilePath = unicode(fullFilePath.encode("utf-8"))

because it raises this error:

UnicodeDecodeError: 'ascii' codec can't decode byte 0xc5 in position 32: ordinal not in range(128)

Python version is 2.7.2

asked Apr 23 '12 20:04

xralf

1 Answer

In Python 2, str holds text as bytes, while unicode holds text as characters.

You decode bytes into unicode and encode unicode into bytes, in both cases naming the encoding to use.

That is:

    >>> 'abc'.decode('utf-8')  # str to unicode
    u'abc'
    >>> u'abc'.encode('utf-8') # unicode to str
    'abc'

UPD Sep 2020: The answer was written when Python 2 was mostly used. In Python 3, str was renamed to bytes, and unicode was renamed to str.

    >>> b'abc'.decode('utf-8') # bytes to str
    'abc'
    >>> 'abc'.encode('utf-8') # str to bytes
    b'abc'
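To connect this back to the question, here is a minimal Python 3 sketch of the same round trip with a non-ASCII path, followed by the update statement from the question. The path value and the in-memory docs table are assumptions made up for illustration; only the statement text comes from the question. The sketch also reproduces the kind of UnicodeDecodeError the asker saw: Python 2 decodes implicitly with the ASCII codec when unicode() is called on a byte str without an encoding, or when encode() is called on a byte str, and that implicit step is what fails on byte 0xc5.

```python
import sqlite3

# Python 3: str holds text (characters), bytes holds encoded data.
path = "/docs/pří/one.txt"           # hypothetical path with non-ASCII characters
encoded = path.encode("utf-8")       # str -> bytes
decoded = encoded.decode("utf-8")    # bytes -> str, same codec in both directions

# Decoding with the wrong codec fails. This mirrors the implicit ASCII
# decoding that Python 2 performs in the situations described above.
try:
    encoded.decode("ascii")
    ascii_error = None
except UnicodeDecodeError as exc:
    ascii_error = exc

# sqlite3 accepts text parameters directly, so no manual encoding is needed.
conn = sqlite3.connect(":memory:")   # throwaway in-memory database (assumption)
cur = conn.cursor()
cur.execute("create table docs (path text)")
cur.execute("insert into docs (path) values (?)", (path,))

full_file_path = path.replace("one", "two")   # replace() on text returns text

# Same statement as in the question, with an explicit dict instead of locals().
cur.execute(
    "update docs set path = :fullFilePath where path = :path",
    {"fullFilePath": full_file_path, "path": path},
)
conn.commit()
stored = cur.execute("select path from docs").fetchone()[0]
conn.close()
```

In Python 2 the equivalent approach would be to keep both values as unicode and, if one arrives as a byte str, call its decode("utf-8") method once instead of unicode() or encode(). That assumes the bytes really are UTF-8.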

answered Oct 07 '22 02:10

newtover
