Python's sqlite3 :memory: option provides faster queries and updates than the equivalent on-disk database. How can I load a disk-based database into memory, do fast operations on it, and then write the updated version back to disk?
The question How to browse an in memory sqlite database in python seems related, but it focuses on how to use a disk-based browsing tool on an in-memory db. The question How can I copy an in-memory SQLite database to another in-memory SQLite database in Python? is also related, but it is specific to Django.
My current solution is to read all of the tables, one-at-a-time, from the disk-based database into lists of tuples, then manually recreate the entire database schema for the in-memory db, and then load the data from the lists of tuples into the in-memory db. After operating on the data, the process is reversed.
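For concreteness, here is a minimal sketch of that manual round-trip for a single table (the table name and schema here are hypothetical, just to illustrate the shape of the workaround):

import sqlite3

# Pull one table's rows off disk into a list of tuples.
disk_db = sqlite3.connect('app.db')
rows = disk_db.execute('SELECT id, body FROM documents').fetchall()

# Manually recreate the schema in memory and reload the rows.
mem_db = sqlite3.connect(':memory:')
mem_db.execute('CREATE TABLE documents (id INTEGER PRIMARY KEY, body TEXT)')
mem_db.executemany('INSERT INTO documents VALUES (?, ?)', rows)
mem_db.commit()
# ... operate on mem_db, then repeat the whole dance in reverse ...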
There must be a better way!
An in-memory database keeps all of its data in the computer's RAM, so only main memory is accessed when querying; this allows much faster access to that data than a disk-based system. An on-disk database, by contrast, stores its data on disk and uses memory only for caching. Since random access to RAM is orders of magnitude faster than to disk (a figure of roughly 100,000 times faster than a spinning disk is often quoted), the advantage of an in-memory database is clear.
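In Python's sqlite3 module, the special filename :memory: is what selects this mode; the database lives entirely in RAM and exists only for the lifetime of the connection:

import sqlite3

# The special name ':memory:' creates a database held entirely in RAM.
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE t (x INTEGER)')
conn.execute('INSERT INTO t VALUES (1)')
print(conn.execute('SELECT x FROM t').fetchone())  # (1,)
conn.close()  # the database vanishes once the connection closes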
The answer at How to load existing db file to memory in Python sqlite3? provided the important clues. Building on it, here is a simplified and generalized version of that code. It eliminates the unnecessary use of StringIO and is packaged in reusable form for both reading a database into memory and writing it back out.
import sqlite3

def copy_database(source_connection, dest_dbname=':memory:'):
    '''Return a connection to a new copy of an existing database.

    Raises an sqlite3.OperationalError if the destination already exists.
    '''
    # Dump the source database to a single SQL script and replay it
    # against a fresh destination database.
    script = ''.join(source_connection.iterdump())
    dest_conn = sqlite3.connect(dest_dbname)
    dest_conn.executescript(script)
    return dest_conn

if __name__ == '__main__':
    from contextlib import closing

    with closing(sqlite3.connect('pepsearch.db')) as disk_db:
        # Copy disk -> memory, operate on the fast in-memory copy,
        # then copy memory -> disk to persist the changes.
        mem_db = copy_database(disk_db)
        mem_db.execute('DELETE FROM documents WHERE uri="pep-3154"')
        mem_db.commit()
        copy_database(mem_db, 'changed.db').close()
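As an aside, if you are on Python 3.7 or later, the sqlite3 module also exposes SQLite's online backup API via Connection.backup(), which copies a whole database without round-tripping through a SQL dump. A minimal sketch of the same disk-to-memory-and-back pattern using it:

import sqlite3

# Python 3.7+: copy via SQLite's backup API instead of iterdump().
disk_db = sqlite3.connect('pepsearch.db')
mem_db = sqlite3.connect(':memory:')
disk_db.backup(mem_db)        # disk -> memory
# ... fast operations on mem_db ...
changed = sqlite3.connect('changed.db')
mem_db.backup(changed)        # memory -> disk
changed.close()
mem_db.close()
disk_db.close()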
Frankly, I wouldn't fool around too much with in-memory databases, unless you really do need an indexed structure that you know will always fit entirely within available memory. SQLite is extremely smart about its I/O, especially when you wrap everything (including reads) in transactions, as you should. It will very efficiently keep things in memory while manipulating data structures that fundamentally live on external storage, and yet it will never exhaust memory (nor take too much of it). I think that RAM really does work better as "a buffer" than as the primary place where data is stored, especially in a virtual-storage environment, where everything must be considered "backed by external storage" anyway.
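To make the transaction point concrete, here is a minimal sketch reusing the question's pepsearch.db example: the sqlite3 connection object works as a context manager that wraps the enclosed work in a transaction.

import sqlite3

conn = sqlite3.connect('pepsearch.db')
try:
    # 'with conn:' commits on success and rolls back if an
    # exception escapes the block.
    with conn:
        conn.execute('DELETE FROM documents WHERE uri = ?', ('pep-3154',))
finally:
    conn.close()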