Is there a built-in way in SQLite (or similar) to keep the best of both worlds SQL / NoSQL, for small projects, i.e.:
- installed with a simple pip install <package>
- each row stored as a dict, without having a common structure for each row, like NoSQL databases

Example:
db = NoSQLite('test.db')
db.addrow({'name': 'john doe', 'balance': 1000, 'data': [1, 73.23, 18]})
db.addrow({'name': 'alice', 'balance': 2000, 'email': '[email protected]'})
for row in db.find('balance > 1500'):
    print(row)
    # {'id': 'f565a9fd3a', 'name': 'alice', 'balance': 2000, 'email': '[email protected]'} # id was auto-generated
Note: over the years I have constantly been amazed by how many interesting features are possible with SQLite in a few lines of code, which is why I'm asking whether what I describe here could be achieved simply with a few SQLite core features.
PS: shelve could look like a solution, but in fact it's just a persistent key/value store and it doesn't have query/find functions; also bsddb (BerkeleyDB for Python) looks deprecated and has no query feature with a similar API.
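To illustrate the shelve limitation, here is a minimal sketch (file name and keys are arbitrary) of what "querying" on top of shelve would look like: it only supports key lookups, so any filtering has to be done by hand in Python:

import shelve

with shelve.open('test_shelve') as db:
    db['f565a9fd3a'] = {'name': 'alice', 'balance': 2000, 'email': '[email protected]'}
    db['9c1d2e3f4a'] = {'name': 'john doe', 'balance': 1000, 'data': [1, 73.23, 18]}
    # no find(): the only way to filter is to scan every stored value
    matches = [v for v in db.values() if v.get('balance', 0) > 1500]
    print(matches)  # [{'name': 'alice', 'balance': 2000, 'email': '[email protected]'}]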
NoSQL is not a relational database. It's helpful to think of NoSQL as a flat-file storage system where the filename is the key and the file contents are the value. You can store whatever you want in these files and you can read/write to them very quickly, but there are no brains behind the storage.
MongoDB stores data in flat files using its own binary storage format, which makes storage very compact and efficient, perfect for high data volumes. The data is held in JSON-like documents, which makes the database very flexible and scalable; MongoDB is a document-oriented database.
SQLite has a backend which is well suited as a key-value store. Here is a NoSQL database based on the SQLite backend: https://github.com/rochus-keller/Udb. I use it in many of my apps, e.g. https://github.com/rochus-keller/CrossLine.
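Udb has its own C++ API, but just to show how little is needed for the plain key/value part on top of SQLite itself, here is a minimal sketch (the table name, column names and the JSON encoding are arbitrary choices for this example, not Udb's format):

import sqlite3, json

db = sqlite3.connect('kv.db')
db.execute('CREATE TABLE IF NOT EXISTS kv (key TEXT PRIMARY KEY, value TEXT)')

def put(key, obj):
    # overwrite any previous value stored under the same key
    db.execute('INSERT OR REPLACE INTO kv VALUES (?, ?)', (key, json.dumps(obj)))
    db.commit()

def get(key):
    row = db.execute('SELECT value FROM kv WHERE key = ?', (key,)).fetchone()
    return json.loads(row[0]) if row else None

put('alice', {'balance': 2000, 'email': '[email protected]'})
print(get('alice'))  # {'balance': 2000, 'email': '[email protected]'}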
The data is arranged in rows -- or records -- across columns or fields. Each row contains the same type of information as the other rows in the flat file; that information is defined by the columns, which describe the type of data and set a limit on the number of characters allowed to represent the field information.
This turned out to be possible with the JSON1 extension and json_extract (see the accepted answer). Example:
import sqlite3, json # tested with precompiled Windows binaries from https://www.sqlite.org/download.html (sqlite3.dll copied in C:\Python37\DLLs)
class sqlitenosql:
    def __init__(self, f):
        self.db = sqlite3.connect(f)
        self.db.execute('CREATE TABLE test(data TEXT);')

    def close(self):
        self.db.commit()
        self.db.close()

    def addrow(self, d):
        self.db.execute("INSERT INTO test VALUES (?);", (json.dumps(d),))

    def find(self, query):
        for k, v in query.items():
            if isinstance(v, str):
                query[k] = f"'{v}'"
        q = ' AND '.join(f" json_extract(data, '$.{k}') = {v}" for k, v in query.items())
        for r in self.db.execute(f"SELECT * FROM test WHERE {q}"):
            yield r[0]
db = sqlitenosql(':memory:')
db.addrow({'name': 'john', 'balance': 1000, 'data': [1, 73.23, 18], 'abc': 'hello'})
db.addrow({'name': 'alice', 'balance': 2000, 'email': '[email protected]'})
db.addrow({'name': 'bob', 'balance': 1000})
db.addrow({'name': 'richard', 'balance': 1000, 'abc': 'hello'})
for r in db.find({'balance': 1000, 'abc': 'hello'}):
    print(r)
# {"name": "john", "balance": 1000, "data": [1, 73.23, 18], "abc": "hello"}
# {"name": "richard", "balance": 1000, "abc": "hello"}
db.close()
Another solution is sqlitedict, as mentioned in Key: value store in Python for possibly 100 GB of data, without client/server and Use SQLite as a key:value store, with:
key = an ID
value = the dict we want to store, e.g. {'name': 'alice', 'balance': 2000, 'email': '[email protected]'}
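A minimal sketch of that approach (keys are arbitrary; note that sqlitedict itself only does key lookups, so the "find" part below is just a manual scan in Python):

from sqlitedict import SqliteDict

db = SqliteDict('test.db', autocommit=True)
db['f565a9fd3a'] = {'name': 'alice', 'balance': 2000, 'email': '[email protected]'}
db['7b2c8d1e9f'] = {'name': 'john doe', 'balance': 1000}

# no built-in query: filtering means iterating over the stored values
for key, value in db.items():
    if value.get('balance', 0) > 1500:
        print(key, value)

db.close()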
Further reading about use of SQLite with JSON: https://community.esri.com/groups/appstudio/blog/2018/08/21/working-with-json-in-sqlite-databases
TinyDB looks like a good solution:
>>> from tinydb import TinyDB, Query
>>> db = TinyDB('path/to/db.json')
>>> User = Query()
>>> db.insert({'name': 'John', 'age': 22})
>>> db.search(User.name == 'John')
[{'name': 'John', 'age': 22}]
However, the documentation mentions that it's not the right tool if we need:
- access from multiple processes or threads,
- creating indexes for tables,
- an HTTP server,
- managing relationships between tables or similar,
- ACID guarantees
So it's a half solution :)
Seems interesting too: WhiteDB
Yes, it's possible by using the JSON1 extension to query JSON data stored in a column:
sqlite> CREATE TABLE test(data TEXT);
sqlite> INSERT INTO test VALUES ('{"name":"john doe","balance":1000,"data":[1,73.23,18]}');
sqlite> INSERT INTO test VALUES ('{"name":"alice","balance":2000,"email":"[email protected]"}');
sqlite> SELECT * FROM test WHERE json_extract(data, '$.balance') > 1500;
data
--------------------------------------------------
{"name":"alice","balance":2000,"email":"[email protected]"}
If you're going to be querying the same field a lot, you can make it more efficient by adding an index on the expression:
CREATE INDEX test_idx_balance ON test(json_extract(data, '$.balance'));
SQLite will then use that index for the above query instead of scanning every single row.
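To check from Python that the index is actually picked up, EXPLAIN QUERY PLAN can be used; a small sketch (this assumes the bundled SQLite has the JSON1 functions available, and the exact wording of the plan differs between SQLite versions):

import sqlite3

db = sqlite3.connect(':memory:')
db.execute('CREATE TABLE test(data TEXT)')
db.execute('''INSERT INTO test VALUES ('{"name":"alice","balance":2000,"email":"[email protected]"}')''')
db.execute("CREATE INDEX test_idx_balance ON test(json_extract(data, '$.balance'))")

# the plan should mention test_idx_balance instead of a full table scan
for row in db.execute("EXPLAIN QUERY PLAN "
                      "SELECT * FROM test WHERE json_extract(data, '$.balance') > 1500"):
    print(row)  # e.g. (..., 'SEARCH test USING INDEX test_idx_balance (<expr>>?)')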