Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Python SQLite: enforce UTF-8 encoding

I'm developing a cross-platform Python (3.7+) application, and I need to rely on sort order of TEXT columns in SQLite, meaning the comparison algorithm of TEXT values must be based on UTF-8 bytes. Even if the system encoding (sys.getdefaultencoding()) is not utf-8.

But in documentation of sqlite3 module I can't find an encoding option for sqlite3.connect.

And I read that the use of sys.setdefaultencoding("utf-8") is an ugly hack and highly discouraged (that's why we need to reload(sys) before calling it)

So what's the solution?

like image 349
saeedgnu Avatar asked Dec 10 '25 02:12

saeedgnu


1 Answers

Looking at Python's _sqlite/connection.c code, either sqlite3_open_v2 or sqlite3_open is called (depending on a compile flag). And based on sqlite doc, both of them use UTF-8 as default database encoding. I'm still not sure about the meaning of word "default" since it doesn't mention any way to override it! But I it doesn't look like that Python can open with another encoding.

#ifdef SQLITE_OPEN_URI
    Py_BEGIN_ALLOW_THREADS
    rc = sqlite3_open_v2(database, &self->db,
                         SQLITE_OPEN_READWRITE | SQLITE_OPEN_CREATE |
                         (uri ? SQLITE_OPEN_URI : 0), NULL);
#else
    if (uri) {
        PyErr_SetString(pysqlite_NotSupportedError, "URIs not supported");
        return -1;
    }
    Py_BEGIN_ALLOW_THREADS
    rc = sqlite3_open(database, &self->db);
#endif
like image 184
saeedgnu Avatar answered Dec 11 '25 15:12

saeedgnu



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!