I'd like to use SQLite FTS3 (FTS4, actually) to index a table with integer columns, conceptually something like this:
CREATE VIRTUAL TABLE whole (document INTEGER, page INTEGER, content TEXT,
UNIQUE(document, page)) USING fts4();
I know that FTS3 treats all columns other than rowid as TEXT, so I'll have to use two tables:
CREATE VIRTUAL TABLE data USING fts4();
CREATE TABLE metadata(document INTEGER, page INTEGER, UNIQUE(document, page));
I want to be able to query for documents, or for pages in a given document:
SELECT DISTINCT document FROM metadata NATURAL JOIN data WHERE content MATCH 'foo';
SELECT page FROM metadata NATURAL JOIN data
WHERE document = 123 AND content MATCH 'foo';
I think the NATURAL JOIN requires me to ensure that the rowids are kept in sync, but what's the best way to do that? Should I be using a FOREIGN KEY or other constraint? Would a sub-select be better than a join?
I'd like an insertion for a document and page already in the database to overwrite the text content. Is that possible programmatically via SQL or will I have to check to see if the row already exists in the info table?
I'm also going to be wanting to DELETE FROM both tables for a given document -- is there a way to do this in a single statement?
All advice gratefully received, but as I am a SQL newbie, code samples particularly appreciated!
Update: It's not at all clear to me how I can create a foreign key constraint here. If I choose metadata
as the parent table (which would be my preference, in the absence of a bidirectional constraint):
PRAGMA foreign_keys = ON;
CREATE TABLE metadata (document INTEGER, page INTEGER);
CREATE VIRTUAL TABLE data USING fts4(content TEXT, docid REFERENCES metadata);
I get Error: vtable constructor failed: data
(unsurprisingly, because docid
is an alias for rowid
, but of course I can't use another column because all columns except rowid
must be TEXT
).
Whereas if I try the other way round:
PRAGMA foreign_keys = ON;
CREATE VIRTUAL TABLE data USING fts4();
CREATE TABLE metadata (document INTEGER, page INTEGER, docid REFERENCES data);
the table construction succeeds, but if I try:
INSERT INTO data (docid, content) VALUES (123, 'testing');
INSERT INTO metadata (docid, document, page) VALUES (123, 12, 23);
I get Error: foreign key mismatch
.
In short, it's pretty difficult to enforce consistency where FTS3 is involved.
SQLite virtual tables don't allow for triggers, and FTS3 tables ignore constraints and affinities.
The best I have been able to do so far is as follows:
CREATE TABLE metadata (document INTEGER, page INTEGER, UNIQUE(document, page));
CREATE VIRTUAL TABLE data USING fts4();
CREATE VIEW whole AS SELECT metadata.rowid AS rowid, document, page, content
FROM metadata JOIN data ON metadata.rowid = data.rowid;
CREATE TRIGGER whole_insert INSTEAD OF INSERT ON whole
BEGIN
INSERT INTO metadata (document, page) VALUES (NEW.document, NEW.page);
INSERT INTO data (rowid, content) VALUES (last_insert_rowid(), NEW.content);
END;
CREATE TRIGGER whole_delete INSTEAD OF DELETE ON whole
BEGIN
DELETE FROM metadata WHERE rowid = OLD.rowid;
DELETE FROM data WHERE rowid = OLD.rowid;
END;
To enforce consistency I could (with PRAGMA recursive_triggers = NO
) create triggers to raise exceptions on direct operations on the metadata
and data
tables, but this is probably overkill for my purposes (likewise, I don't need the UPDATE
trigger for the whole
table).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With