SQLite supports a "shared cache" for :memory:
databases when they are opened with a special URI (according to sqlite.org):
[T]he same in-memory database can be opened by two or more database connections as follows:
rc = sqlite3_open("file::memory:?cache=shared",&db);
I can take advantage of this in Python 3.4 by using the URI parameter for sqlite3.connect():
sqlite3.connect('file::memory:?cache=shared', uri=True)
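To see the sharing in action, here is a minimal sketch (Python 3.4+; the table name `t` and the value are just illustrative). The database survives as long as at least one connection to it stays open:

```python
import sqlite3

# Both connections refer to the same shared-cache in-memory database.
db1 = sqlite3.connect('file::memory:?cache=shared', uri=True)
db2 = sqlite3.connect('file::memory:?cache=shared', uri=True)

db1.execute('CREATE TABLE t (x INTEGER)')
db1.execute('INSERT INTO t VALUES (42)')
db1.commit()

# The second connection sees the table created by the first.
rows = db2.execute('SELECT x FROM t').fetchall()
print(rows)  # [(42,)]

db1.close()
db2.close()
```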
However, I can't seem to get the same thing working for SQLAlchemy:
engine = sqlalchemy.create_engine('sqlite:///:memory:?cache=shared')
engine.connect()
...
TypeError: 'cache' is an invalid keyword argument for this function
Is there some way to get SQLAlchemy to make use of the shared cache?
Edit:
On Python 3.4, I can use the creator argument to create_engine to solve the problem, but the problem remains on other Python versions:
creator = lambda: sqlite3.connect('file::memory:?cache=shared', uri=True)
engine = sqlalchemy.create_engine('sqlite://', creator=creator)
engine.connect()
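To confirm the workaround actually shares state, here is a sketch (table name and values are illustrative) where two separate engines built with the same creator end up reading and writing one in-memory database:

```python
import sqlite3
import sqlalchemy
from sqlalchemy import text

# Every pooled connection is opened through the creator, so separate
# engines all attach to the same shared-cache in-memory database.
creator = lambda: sqlite3.connect('file::memory:?cache=shared', uri=True)
engine = sqlalchemy.create_engine('sqlite://', creator=creator)
other = sqlalchemy.create_engine('sqlite://', creator=creator)

# Write through one engine ...
with engine.begin() as conn:
    conn.execute(text('CREATE TABLE t (x INTEGER)'))
    conn.execute(text('INSERT INTO t VALUES (1)'))

# ... and read it back through the other.
with other.connect() as conn:
    rows = conn.execute(text('SELECT x FROM t')).fetchall()
```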
On older Python versions, sqlite3.connect() does not accept the uri keyword, so only pass uri=True where it is supported:
import sqlite3
import sys

import sqlalchemy

DB_URI = 'file::memory:?cache=shared'

# The uri parameter was only added to sqlite3.connect() in Python 3.4.
PY2 = sys.version_info.major == 2
params = {} if PY2 else {'uri': True}

creator = lambda: sqlite3.connect(DB_URI, **params)
engine = sqlalchemy.create_engine('sqlite:///:memory:', creator=creator)
engine.connect()
The SQLAlchemy docs on the SQLite dialect describe the problem and a solution in detail:
Threading/Pooling Behavior
Pysqlite’s default behavior is to prohibit the usage of a single connection in more than one thread. This is originally intended to work with older versions of SQLite that did not support multithreaded operation under various circumstances. In particular, older SQLite versions did not allow a :memory: database to be used in multiple threads under any circumstances.
Pysqlite does include a now-undocumented flag known as check_same_thread which will disable this check; however, note that pysqlite connections are still not safe to use concurrently in multiple threads. In particular, any statement execution calls would need to be externally mutexed, as Pysqlite does not provide for thread-safe propagation of error messages, among other things. So while even :memory: databases can be shared among threads in modern SQLite, Pysqlite doesn't provide enough thread-safety to make this usage worth it.
SQLAlchemy sets up pooling to work with Pysqlite's default behavior:
When a :memory: SQLite database is specified, the dialect by default will use SingletonThreadPool. This pool maintains a single connection per thread, so that all access to the engine within the current thread uses the same :memory: database - other threads would access a different :memory: database.
When a file-based database is specified, the dialect will use NullPool as the source of connections. This pool closes and discards connections which are returned to the pool immediately. SQLite file-based connections have extremely low overhead, so pooling is not necessary. The scheme also prevents a connection from being used again in a different thread and works best with SQLite's coarse-grained file locking.
Using a Memory Database in Multiple Threads
To use a :memory: database in a multithreaded scenario, the same connection object must be shared among threads, since the database exists only within the scope of that connection. The StaticPool implementation will maintain a single connection globally, and the check_same_thread flag can be passed to Pysqlite as False:
from sqlalchemy.pool import StaticPool
engine = create_engine('sqlite://',
                       connect_args={'check_same_thread': False},
                       poolclass=StaticPool)
Note that using a :memory: database in multiple threads requires a recent version of SQLite.
Source: https://docs.sqlalchemy.org/en/13/dialects/sqlite.html#threading-pooling-behavior
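For completeness, a runnable sketch of the StaticPool approach the docs describe: one global connection shared with a worker thread (the table and value are illustrative):

```python
import threading
from sqlalchemy import create_engine, text
from sqlalchemy.pool import StaticPool

# StaticPool keeps exactly one connection; check_same_thread=False lets
# threads other than the creator use it.
engine = create_engine(
    'sqlite://',
    connect_args={'check_same_thread': False},
    poolclass=StaticPool,
)

with engine.begin() as conn:
    conn.execute(text('CREATE TABLE t (x INTEGER)'))
    conn.execute(text('INSERT INTO t VALUES (7)'))

results = []

def worker():
    # Same :memory: database, because the pool reuses the one connection.
    with engine.connect() as conn:
        results.append(conn.execute(text('SELECT x FROM t')).scalar())

thread = threading.Thread(target=worker)
thread.start()
thread.join()
print(results)  # [7]
```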