Json received from the server has this form. <pre class="prettyprint"><code>[ { "id": 1103333, "name": "James", "tagA": [ "apple", "orange", "grape" ], "tagB": [ "red", "green", "blue" ], "tagC": null }, { "id": 1103336, "name": "John", "tagA": [ "apple", "pinapple", "melon" ], "tagB": [ "black", "white", "blue" ], "tagC": [ "London", "New York" ] } ] </code></pre> An object can have multiple tags, and a tag can be associated with multiple objects. In this list, I want to find an object whose tagA is apple or grape and tagB is black. This is the first table I used to write. <pre class="prettyprint"><code>create table response(id integer primary key, name text not null, tagA text, tagB text, tagC text) select * from response where (tagA like '%apple%' or tagA like '%grape%') and (tagB like '%black%') </code></pre> This type of table design has a problem that the search speed is very slow because it does not support the surface function of the fts function when using ORM library such as Room. The next thing I thought about was to create a table for each tag. <pre class="prettyprint"><code>create table response(id integer primary key, name text not null) create table tagA(objectID integer, value text, primary key(objectID, value)) create table tagB(objectID integer, value text, primary key(objectID, value)) create table tagC(objectID integer, value text, primary key(objectID, value)) select * from response where id in ((select objectId from tagA where value in ('apple','grape')) intersect (select objectId from tagB where value in 'black')) </code></pre> This greatly increases the insertion time and the capacity of the APK (roughly twice as much per additional table), but the search speed is far behind that of the FTS virtual table. I want to avoid this as much as I use FTS tables because there are more things I need to manage myself. There are a lot of things I missed (index etc.) but I can not figure out what it is. How can I optimize the database without using the FTS method?

You could use a reference table (aka mapping table along with a multitude of other names) to allow a many-many relationship between tags (single table for all) and objects (again single table). So you have the objects table each object having an id and you have the tags table again with an id for each object. So something like :- <pre class="prettyprint"><code>DROP TABLE IF EXISTS object_table; CREATE TABLE IF NOT EXISTS object_table (id INTEGER PRIMARY KEY, object_name); DROP TABLE IF EXISTS tag_table; CREATE TABLE IF NOT EXISTS tag_table (id INTEGER PRIMARY KEY, tag_name); </code></pre> You'd populate both e.g. <pre class="prettyprint"><code>INSERT INTO object_table (object_name) VALUES ('Object1'),('Object2'),('Object3'),('Object4'); INSERT INTO tag_table (tag_name) VALUES ('Apple'),('Orange'),('Grape'),('Pineapple'),('Melon'), ('London'),('New York'),('Paris'), ('Red'),('Green'),('Blue'); -- and so on </code></pre> The you'd have the mapping table something like :- <pre class="prettyprint"><code>DROP TABLE IF EXISTS object_tag_mapping; CREATE TABLE IF NOT EXISTS object_tag_mapping (object_reference INTEGER, tag_reference INTEGER); </code></pre> Overtime as tags are assigned to objects or vice-versa you add the mappings e.g. :- <pre class="prettyprint"><code>INSERT INTO object_tag_mapping VALUES (1,4), -- obj1 has tag Pineapple (1,1), -- obj1 has Apple (1,8), -- obj1 has Paris (1,10), -- obj1 has green (4,1),(4,3),(4,11), -- some tags for object 4 (2,8),(2,7),(2,4), -- some tags for object 2 (3,1),(3,2),(3,3),(3,4),(3,5),(3,6),(3,7),(3,8),(3,9),(3,10),(3,11); -- all tags for object 3 </code></pre> You could then have queries such as :- <pre class="prettyprint"><code>SELECT object_name, group_concat(tag_name,' ~ ') AS tags_for_this_object FROM object_tag_mapping JOIN object_table ON object_reference = object_table.id JOIN tag_table ON tag_reference = tag_table.id GROUP BY object_name ; </code></pre> <ul> <li>group_concat is an aggregate function (applied per GROUP) that concatenates all values found for the specified column with (optional) separator.</li> </ul> The result of the query being :- <img src="https://i.stack.imgur.com/kCXpI.jpg" alt="enter image description here"> The following could be a search based upon tags (not that you'd likely use both tag_name and a tag_reference) :- <pre class="prettyprint"><code>SELECT object_name, tag_name FROM object_tag_mapping JOIN object_table ON object_reference = object_table.id JOIN tag_table ON tag_reference = tag_table.id WHERE tag_name = 'Pineapple' OR tag_reference = 9 ; </code></pre> This would result in :- <img src="https://i.stack.imgur.com/DFXmg.jpg" alt="enter image description here"> <hr> <ul> <li>Note this is a simple overview e.g. you may want to consider having the mapping table as a WITHOUT ROWID table, perhaps have a composite UNIQUE constraint.</li> </ul> <h3>Additional re comment :-</h3> <blockquote> How do I implement a query that contains two or more tags at the same time? </blockquote> This is a little more complex if you want specific tags but still doable. Here's an example using a CTE (Common Table Expression) along with a HAVING clause (a where clause applied after the output has been generated, so can be applied to aggregates) :- <pre class="prettyprint"><code>WITH cte1(otm_oref,otm_tref,tt_id,tt_name, ot_id, ot_name) AS ( SELECT * FROM object_tag_mapping JOIN tag_table ON tag_reference = tag_table.id JOIN object_table ON object_reference = object_table.id WHERE tag_name = 'Pineapple' OR tag_name = 'Apple' ) SELECT ot_name, group_concat(tt_name), count() AS cnt FROM CTE1 GROUP BY otm_oref HAVING cnt = 2 ; </code></pre> This results in :- <img src="https://i.stack.imgur.com/6rUEE.jpg" alt="enter image description here">

What is the best way to design a tag-based data table with Sqlite?

Tags:

sqlite

full-text-search

android-sqlite

database-design

Json received from the server has this form.

[
 {
  "id": 1103333,
  "name": "James",
  "tagA": [
    "apple",
    "orange",
    "grape"
  ],
  "tagB": [
    "red",
    "green",
    "blue"
  ],
  "tagC": null
  },

  {
  "id": 1103336,
  "name": "John",
  "tagA": [
    "apple",
    "pinapple",
    "melon"
  ],
  "tagB": [
    "black",
    "white",
    "blue"
  ],
  "tagC": [
    "London",
    "New York"
    ]
  }
]

An object can have multiple tags, and a tag can be associated with multiple objects.

In this list, I want to find an object whose tagA is apple or grape and tagB is black.

This is the first table I used to write.

create table response(id integer primary key, name text not null, tagA text, 
tagB text, tagC text)

select * from response where (tagA like '%apple%' or tagA like '%grape%') and (tagB like '%black%')

This type of table design has a problem that the search speed is very slow because it does not support the surface function of the fts function when using ORM library such as Room.

The next thing I thought about was to create a table for each tag.

create table response(id integer primary key, name text not null)

create table tagA(objectID integer, value text, primary key(objectID, value))

create table tagB(objectID integer, value text, primary key(objectID, value))

create table tagC(objectID integer, value text, primary key(objectID, value))

select * from response where id in ((select objectId from tagA where value in ('apple','grape')) 
intersect
(select objectId from tagB where value in 'black'))

This greatly increases the insertion time and the capacity of the APK (roughly twice as much per additional table), but the search speed is far behind that of the FTS virtual table.

I want to avoid this as much as I use FTS tables because there are more things I need to manage myself.

There are a lot of things I missed (index etc.) but I can not figure out what it is.

How can I optimize the database without using the FTS method?

313

asked Jul 02 '18 03:07

H.Kim

1 Answers

You could use a reference table (aka mapping table along with a multitude of other names) to allow a many-many relationship between tags (single table for all) and objects (again single table).

So you have the objects table each object having an id and you have the tags table again with an id for each object. So something like :-

DROP TABLE IF EXISTS object_table;
CREATE TABLE IF NOT EXISTS object_table (id INTEGER PRIMARY KEY, object_name);
DROP TABLE IF EXISTS tag_table;
CREATE TABLE IF NOT EXISTS tag_table (id INTEGER PRIMARY KEY, tag_name);

You'd populate both e.g.

INSERT INTO object_table (object_name) VALUES
    ('Object1'),('Object2'),('Object3'),('Object4');
INSERT INTO tag_table (tag_name) VALUES
    ('Apple'),('Orange'),('Grape'),('Pineapple'),('Melon'),
    ('London'),('New York'),('Paris'),
    ('Red'),('Green'),('Blue'); -- and so on

The you'd have the mapping table something like :-

DROP TABLE IF EXISTS object_tag_mapping;
CREATE TABLE IF NOT EXISTS object_tag_mapping (object_reference INTEGER, tag_reference INTEGER);

Overtime as tags are assigned to objects or vice-versa you add the mappings e.g. :-

INSERT INTO object_tag_mapping VALUES
    (1,4), -- obj1 has tag Pineapple
    (1,1),  -- obj1 has Apple
    (1,8), -- obj1 has Paris
    (1,10), -- obj1 has green
    (4,1),(4,3),(4,11), -- some tags for object 4
    (2,8),(2,7),(2,4), -- some tags for object 2
    (3,1),(3,2),(3,3),(3,4),(3,5),(3,6),(3,7),(3,8),(3,9),(3,10),(3,11); -- all tags for object 3

You could then have queries such as :-

SELECT object_name, 
    group_concat(tag_name,' ~ ') AS tags_for_this_object 
FROM object_tag_mapping 
JOIN object_table ON object_reference = object_table.id
JOIN tag_table ON tag_reference = tag_table.id
GROUP BY object_name
;

group_concat is an aggregate function (applied per GROUP) that concatenates all values found for the specified column with (optional) separator.

The result of the query being :-

enter image description here

The following could be a search based upon tags (not that you'd likely use both tag_name and a tag_reference) :-

SELECT object_name, tag_name 
FROM object_tag_mapping 
JOIN object_table ON object_reference = object_table.id
JOIN tag_table ON tag_reference = tag_table.id
WHERE tag_name = 'Pineapple' OR tag_reference = 9
;

This would result in :-

enter image description here

Note this is a simple overview e.g. you may want to consider having the mapping table as a WITHOUT ROWID table, perhaps have a composite UNIQUE constraint.

Additional re comment :-

How do I implement a query that contains two or more tags at the same time?

This is a little more complex if you want specific tags but still doable. Here's an example using a CTE (Common Table Expression) along with a HAVING clause (a where clause applied after the output has been generated, so can be applied to aggregates) :-

WITH cte1(otm_oref,otm_tref,tt_id,tt_name, ot_id, ot_name) AS 
    (
        SELECT * FROM object_tag_mapping 
        JOIN tag_table ON tag_reference = tag_table.id 
        JOIN object_table ON object_reference = object_table.id
        WHERE tag_name = 'Pineapple' OR tag_name = 'Apple'
    )
SELECT ot_name, group_concat(tt_name), count() AS cnt FROM CTE1 
GROUP BY otm_oref
HAVING cnt = 2
;

This results in :-

enter image description here

answered Sep 21 '22 16:09

MikeT

Related questions
                            
                                Inserting datetime into a SQLite Database
                            
                                How to get parsed Date from Sqlite3 databse with Java?
                            
                                how to add column in a sqlite table in SWIFT
                            
                                Laravel/Eloquent 5 whereBetween not working as expected
                            
                                SQLite: *prevent* PRIMARY KEY value from resetting after delete all rows [duplicate]
                            
                                How to provide user defined function for python blaze with sqlite backend?
                            
                                Adding a custom sqlite function to a Qt application
                            
                                SQLite SUM() between several rows
                            
                                Unicode SQLDriverConnectW(): [unixODBC][Driver Manager]Data source name not found, and no default driver specified
                            
                                Ability to control byte size of INTEGER type
                            
                                How to check if record is present in sqlite in C
                            
                                Import SQLite database from sdcard to android - Without Assets Folder
                            
                                SQLite recursive query to return file path
                            
                                "Unable to locate the SpatiaLite library." Django
                            
                                Sqlite using Dapper. Error parsing column . Unable to cast object of type System.Int64 to type System.Double
                            
                                sqlite3 is not be able to import in python 3
                            
                                SQL error or missing database (table destination has x columns but y values were supplied)
                            
                                Python - Sqlite - OperationalError: near "s": syntax error [duplicate]
                            
                                How to load a SQLite3 extension in SQLAlchemy?
                            
                                Database is locked (defering rows didn't work)

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

What is the best way to design a tag-based data table with Sqlite?

Tags:

sqlite

full-text-search

android-sqlite

database-design

H.Kim

People also ask

1 Answers

Additional re comment :-

MikeT

Recent Activity

Donate For Us