I'm using SQLAlchemy and MySQL, with a files
table to store files. That table is defined as follows:
mysql> show full columns in files;
+---------+--------------+-----------------+------+-----+---------+-------+---------------------------------+---------+
| Field | Type | Collation | Null | Key | Default | Extra | Privileges | Comment |
+---------+--------------+-----------------+------+-----+---------+-------+---------------------------------+---------+
| id | varchar(32) | utf8_general_ci | NO | PRI | NULL | | select,insert,update,references | |
| created | datetime | NULL | YES | | NULL | | select,insert,update,references | |
| updated | datetime | NULL | YES | | NULL | | select,insert,update,references | |
| content | mediumblob | NULL | YES | | NULL | | select,insert,update,references | |
| name | varchar(500) | utf8_general_ci | YES | | NULL | | select,insert,update,references | |
+---------+--------------+-----------------+------+-----+---------+-------+---------------------------------+---------+
The content column of type MEDIUMBLOB
is where the files are stored. In SQLAlchemy that column is declared as:
__maxsize__ = 12582912 # 12MiB
content = Column(LargeBinary(length=__maxsize__))
I am not quite sure about the difference between SQLAlchemy's BINARY
type and LargeBinary
type. Or the difference between MySQL's VARBINARY
type and BLOB
type. And I am not quite sure if that matters here.
Question: Whenever I store an actual binary file in that table, i.e. a Python bytes
or b''
object , then I get the following warning
.../python3.4/site-packages/sqlalchemy/engine/default.py:451: Warning: Invalid utf8 character string: 'BCB121'
cursor.execute(statement, parameters)
I don't want to just ignore the warning, but it seems that the files are in tact. How do I handle this warning gracefully, how can I fix its cause?
Side note: This question seems to be related, and it seems to be a MySQL bug that it tries to convert all incoming data to UTF-8 (this answer).
Turns out that this was a driver issue. Apparently the default MySQL driver stumbles with Py3 and utf8 support. Installing cymysql into the virtual Python environment resolved this problem and the warnings disappear.
The fix: Find out if MySQL connects through socket or port (see here), and then modify the connection string accordingly. In my case using a socket connection:
mysql+cymysql://user:pwd@localhost/database?unix_socket=/var/run/mysqld/mysqld.sock
Use the port
argument otherwise.
Edit: While the above fixed the encoding issue, it gave rise to another one: blob size. Due to a bug in CyMySQL blobs larger than 8M fail to commit. Switching to PyMySQL fixed that problem, although it seems to have a similar issue with large blobs.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With