Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

sqlite3.ProgrammingError: You must not use 8-bit bytestrings unless you use a text_factory that can interpret 8-bit bytestrings

Using SQLite3 in Python, I am trying to store a compressed version of a snippet of UTF-8 HTML code.

Code looks like this:

...
c = connection.cursor()
c.execute('create table blah (cid integer primary key,html blob)')
...
c.execute('insert or ignore into blah values (?, ?)',(cid, zlib.compress(html)))

At which point at get the error:

sqlite3.ProgrammingError: You must not use 8-bit bytestrings unless you use a text_factory that can interpret 8-bit bytestrings (like text_factory = str). It is highly recommended that you instead just switch your application to Unicode strings.

If I use 'text' rather than 'blob' and don't compress the HTML snippet, it works all fine (db is to large though). When I use 'blob' and compress via Python zlib library, I get the above error message. I looked around but couldn't find a simple answer for this one.

like image 561
R. Hill Avatar asked Aug 06 '10 15:08

R. Hill


4 Answers

If you want to use 8-bit strings instead of unicode string in sqlite3, set appropriate text_factory for sqlite connection:

connection = sqlite3.connect(...)
connection.text_factory = str
like image 168
zag Avatar answered Oct 15 '22 18:10

zag


Found the solution, I should have spent just a little more time searching.

Solution is to 'cast' the value as a Python 'buffer', like so:

c.execute('insert or ignore into blah values (?, ?)',(cid, buffer(zlib.compress(html))))

Hopefully this will help somebody else.

like image 41
R. Hill Avatar answered Oct 15 '22 18:10

R. Hill


In order to work with the BLOB type, you must first convert your zlib compressed string into binary data - otherwise sqlite will try to process it as a text string. This is done with sqlite3.Binary(). For example:

c.execute('insert or ignore into blah values (?, ?)',(cid, 
sqlite3.Binary(zlib.compress(html))))
like image 25
MarioVilas Avatar answered Oct 15 '22 20:10

MarioVilas


Syntax:

5 types of possible storage: NULL, INTEGER, TEXT, REAL and BLOB

BLOB is generally used to store pickled models or dill pickled models

> cur.execute('''INSERT INTO Tablename(Col1, Col2, Col3, Col4) VALUES(?,?,?,?)''', 
                                      [TextValue, Real_Value, Buffer(model), sqlite3.Binary(model2)])
> conn.commit()

> # Read Data:
> df = pd.read_sql('SELECT * FROM Model, con=conn) 
> model1 = str(df['Col3'].values[0]))
> model2 = str(df['Col'].values[0]))
like image 42
Pranzell Avatar answered Oct 15 '22 20:10

Pranzell