Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

sqlalchemy and SQLite shared cache

Tags:

SQLite supports a "shared cache" for :memory: databases when they are opened with a special URI (according to sqlite.org):

[T]he same in-memory database can be opened by two or more database connections as follows:

rc = sqlite3_open("file::memory:?cache=shared",&db);

I can take advantage of this in Python 3.4 by using the URI parameter for sqlite3.connect():

sqlite3.connect('file::memory:?cache=shared', uri=True)

However, I can't seem to get the same thing working for SQLAlchemy:

engine = sqlalchemy.create_engine('sqlite:///:memory:?cache=shared')
engine.connect()
...
TypeError: 'cache' is an invalid keyword argument for this function

Is there some way to get SQLAlchemy to make use of the shared cache?

Edit:
On Python 3.4, I can use the creator argument to create_engine to solve the problem, but the problem remains on other Python versions:

creator = lambda: sqlite3.connect('file::memory:?cache=shared', uri=True)
engine = sqlalchemy.create_engine('sqlite://', creator=creator)
engine.connect()
like image 960
bbayles Avatar asked Jan 12 '15 21:01

bbayles


People also ask

Does SQLAlchemy support SQLite?

The great thing about SQLAlchemy is that it supports all popular database systems, including SQLite3, MySQL, PostgreSQL, Oracle, Microsoft SQL Server, etc. So let's start by creating our own wrapper library based on SQLAlchemy.

Is SQLAlchemy and SQLite same?

Sqlite is a database storage engine, which can be better compared with things such as MySQL, PostgreSQL, Oracle, MSSQL, etc. It is used to store and retrieve structured data from files. SQLAlchemy is a Python library that provides an object relational mapper (ORM).

Does SQLite cache data?

SQLite provides an in-memory cache which you size according to the maximum number of database pages that you want to hold in memory at any given time. Berkeley DB also provides an in-memory cache that performs the same function as SQLite.

Does SQLite allow multiple connections?

Overview. Usually, SQLite allows at most one writer to proceed concurrently. The BEGIN CONCURRENT enhancement allows multiple writers to process write transactions simultanously if the database is in "wal" or "wal2" mode, although the system still serializes COMMIT commands.


2 Answers

You should avoid passing uri=True on older Python versions and the problem will be fixed:

import sqlite3
import sys

import sqlalchemy


DB_URI = 'file::memory:?cache=shared'
PY2 = sys.version_info.major == 2
if PY2:
    params = {}
else:
    params = {'uri': True}

creator = lambda: sqlite3.connect(DB_URI, **params)

engine = sqlalchemy.create_engine('sqlite:///:memory:', creator=creator)
engine.connect()
like image 164
Álvaro Justen Avatar answered Sep 19 '22 19:09

Álvaro Justen


SQLAlchemy docs about the SQLite dialect describe the problem and a solution in detail:

Threading/Pooling Behavior

Pysqlite’s default behavior is to prohibit the usage of a single connection in more than one thread. This is originally intended to work with older versions of SQLite that did not support multithreaded operation under various circumstances. In particular, older SQLite versions did not allow a :memory: database to be used in multiple threads under any circumstances.

Pysqlite does include a now-undocumented flag known as check_same_thread which will disable this check, however note that pysqlite connections are still not safe to use in concurrently in multiple threads. In particular, any statement execution calls would need to be externally mutexed, as Pysqlite does not provide for thread-safe propagation of error messages among other things. So while even :memory: databases can be shared among threads in modern SQLite, Pysqlite doesn’t provide enough thread-safety to make this usage worth it.

SQLAlchemy sets up pooling to work with Pysqlite’s default behavior:

  • When a :memory: SQLite database is specified, the dialect by default will use SingletonThreadPool. This pool maintains a single connection per thread, so that all access to the engine within the current thread use the same :memory: database - other threads would access a different :memory: database.

  • When a file-based database is specified, the dialect will use NullPool as the source of connections. This pool closes and discards connections which are returned to the pool immediately. SQLite file-based connections have extremely low overhead, so pooling is not necessary. The scheme also prevents a connection from being used again in a different thread and works best with SQLite’s coarse-grained file locking.

Using a Memory Database in Multiple Threads

To use a :memory: database in a multithreaded scenario, the same connection object must be shared among threads, since the database exists only within the scope of that connection. The StaticPool implementation will maintain a single connection globally, and the check_same_thread flag can be passed to Pysqlite as False:

from sqlalchemy.pool import StaticPool
engine = create_engine('sqlite://',
              connect_args={'check_same_thread':False},
              poolclass=StaticPool)

Note that using a :memory: database in multiple threads requires a recent version of SQLite.

Source: https://docs.sqlalchemy.org/en/13/dialects/sqlite.html#threading-pooling-behavior

like image 42
Hubert Grzeskowiak Avatar answered Sep 17 '22 19:09

Hubert Grzeskowiak