Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What is the best way to design a tag-based data table with Sqlite?

Json received from the server has this form.

[
 {
  "id": 1103333,
  "name": "James",
  "tagA": [
    "apple",
    "orange",
    "grape"
  ],
  "tagB": [
    "red",
    "green",
    "blue"
  ],
  "tagC": null
  },

  {
  "id": 1103336,
  "name": "John",
  "tagA": [
    "apple",
    "pinapple",
    "melon"
  ],
  "tagB": [
    "black",
    "white",
    "blue"
  ],
  "tagC": [
    "London",
    "New York"
    ]
  }
]

An object can have multiple tags, and a tag can be associated with multiple objects.

In this list, I want to find an object whose tagA is apple or grape and tagB is black.

This is the first table I used to write.

create table response(id integer primary key, name text not null, tagA text, 
tagB text, tagC text)

select * from response where (tagA like '%apple%' or tagA like '%grape%') and (tagB like '%black%')

This type of table design has a problem that the search speed is very slow because it does not support the surface function of the fts function when using ORM library such as Room.

The next thing I thought about was to create a table for each tag.

create table response(id integer primary key, name text not null)

create table tagA(objectID integer, value text, primary key(objectID, value))

create table tagB(objectID integer, value text, primary key(objectID, value))

create table tagC(objectID integer, value text, primary key(objectID, value))

select * from response where id in ((select objectId from tagA where value in ('apple','grape')) 
intersect
(select objectId from tagB where value in 'black'))

This greatly increases the insertion time and the capacity of the APK (roughly twice as much per additional table), but the search speed is far behind that of the FTS virtual table.

I want to avoid this as much as I use FTS tables because there are more things I need to manage myself.

There are a lot of things I missed (index etc.) but I can not figure out what it is.

How can I optimize the database without using the FTS method?

like image 313
H.Kim Avatar asked Jul 02 '18 03:07

H.Kim


People also ask

Does SQLite use schema?

Every SQLite database contains a single "schema table" that stores the schema for that database. The schema for a database is a description of all of the other tables, indexes, triggers, and views that are contained within the database.

How are tags stored in database?

Tags are stored in table "label" and once we add a tag to a ticket or anything, that gets stored in "label_entry" table.

How many tables can you create with SQLite?

Maximum Number Of Tables In A Join SQLite does not support joins containing more than 64 tables. This limit arises from the fact that the SQLite code generator uses bitmaps with one bit per join-table in the query optimizer.


1 Answers

You could use a reference table (aka mapping table along with a multitude of other names) to allow a many-many relationship between tags (single table for all) and objects (again single table).

So you have the objects table each object having an id and you have the tags table again with an id for each object. So something like :-

DROP TABLE IF EXISTS object_table;
CREATE TABLE IF NOT EXISTS object_table (id INTEGER PRIMARY KEY, object_name);
DROP TABLE IF EXISTS tag_table;
CREATE TABLE IF NOT EXISTS tag_table (id INTEGER PRIMARY KEY, tag_name);

You'd populate both e.g.

INSERT INTO object_table (object_name) VALUES
    ('Object1'),('Object2'),('Object3'),('Object4');
INSERT INTO tag_table (tag_name) VALUES
    ('Apple'),('Orange'),('Grape'),('Pineapple'),('Melon'),
    ('London'),('New York'),('Paris'),
    ('Red'),('Green'),('Blue'); -- and so on

The you'd have the mapping table something like :-

DROP TABLE IF EXISTS object_tag_mapping;
CREATE TABLE IF NOT EXISTS object_tag_mapping (object_reference INTEGER, tag_reference INTEGER);

Overtime as tags are assigned to objects or vice-versa you add the mappings e.g. :-

INSERT INTO object_tag_mapping VALUES
    (1,4), -- obj1 has tag Pineapple
    (1,1),  -- obj1 has Apple
    (1,8), -- obj1 has Paris
    (1,10), -- obj1 has green
    (4,1),(4,3),(4,11), -- some tags for object 4
    (2,8),(2,7),(2,4), -- some tags for object 2
    (3,1),(3,2),(3,3),(3,4),(3,5),(3,6),(3,7),(3,8),(3,9),(3,10),(3,11); -- all tags for object 3

You could then have queries such as :-

SELECT object_name, 
    group_concat(tag_name,' ~ ') AS tags_for_this_object 
FROM object_tag_mapping 
JOIN object_table ON object_reference = object_table.id
JOIN tag_table ON tag_reference = tag_table.id
GROUP BY object_name
;
  • group_concat is an aggregate function (applied per GROUP) that concatenates all values found for the specified column with (optional) separator.

The result of the query being :-

enter image description here

The following could be a search based upon tags (not that you'd likely use both tag_name and a tag_reference) :-

SELECT object_name, tag_name 
FROM object_tag_mapping 
JOIN object_table ON object_reference = object_table.id
JOIN tag_table ON tag_reference = tag_table.id
WHERE tag_name = 'Pineapple' OR tag_reference = 9
;

This would result in :-

enter image description here


  • Note this is a simple overview e.g. you may want to consider having the mapping table as a WITHOUT ROWID table, perhaps have a composite UNIQUE constraint.

Additional re comment :-

How do I implement a query that contains two or more tags at the same time?

This is a little more complex if you want specific tags but still doable. Here's an example using a CTE (Common Table Expression) along with a HAVING clause (a where clause applied after the output has been generated, so can be applied to aggregates) :-

WITH cte1(otm_oref,otm_tref,tt_id,tt_name, ot_id, ot_name) AS 
    (
        SELECT * FROM object_tag_mapping 
        JOIN tag_table ON tag_reference = tag_table.id 
        JOIN object_table ON object_reference = object_table.id
        WHERE tag_name = 'Pineapple' OR tag_name = 'Apple'
    )
SELECT ot_name, group_concat(tt_name), count() AS cnt FROM CTE1 
GROUP BY otm_oref
HAVING cnt = 2
;

This results in :-

enter image description here

like image 58
MikeT Avatar answered Sep 21 '22 16:09

MikeT