I'm working on an enterprise sales app, for iPad, that uses Sqlite as its internal database, and a strange behaviour recently showed up.
I have a huge table that is filled with information from several other tables (sort of like a "materialized view"), which can contain over 2 million rows, depending on how the user is set up. When the user wants to search for an item, the app performs a query on this huge table that has an indexed column and on other columns that are used as filters and/or metadata. I'll post the query and the basic idea below. Anyway, this query usually returns in 2~3 seconds on an iPad 4th gen, no more than that, and this is just fine. This table is dropped, re-created and filled every time the user taps a button to synchronize his data with our server.
However, recently the same query in the same table (with no relevant changes at all), randomly started to take 40~50 seconds. If you do the same thing later, on the same device, with the same filters (or even changing the filters!), the same query on the same table takes the 2~3 seconds again. I haven't found any specific situation that causes this slowdown, the app is the only one running at that time. The device is not the problem, we've seen this happen on at least 5 different iPads, one is an iPad 3 and the others are iPads 4th gen.
I don't think it is some sort of caching, since the app does not cache anything, and these times are rather random. Sometimes they take 40 seconds for 10 times in a row, then suddenly it starts to take only 2 seconds again, and the same thing the other way. The only thing that is clear to me is that this slowdown only occurs after intensive use (1 - 2 days of work using the app), so I'm also having troubles to cause this behaviour while debugging on the iPad I have with me.
What I've tried:
The huge table is constructed as the following, and does NOT have any foreign keys or other any other constraint:
CREATE TABLE FMV_CATALOG(
UNIQUE_ID TEXT,
PRODUCT_ID INTEGER,
<bunch of metadata/filtered columns - total of 20 columns>
);
And the query that is made to find the products is:
SELECT
PRODUCT_ID
,UNIQUE_ID
<all other required columns, ~20 columns>
FROM
FMV_CATALOG
WHERE
UNIQUE_ID = '<some id>_<other id>'
AND PRODUCT_NAME LIKE '%iPhone%'
<and other optional, rarely used, filters.>
I'm totally out of ideas, so any help will be appreciated.
Thanks!
UPDATE (more info):
Important informations that I forgot to mention, Rob reminded me of it. My database connection is always open, it is closed only when the user logs out. We've noticed a huge performance on all parts of the app when we kept the connection opened, since we have hundreds of small queries that are executed on other situations (but not while browsing/searching the products catalog).
The query used to create the index is below:
CREATE INDEX IDX_MV_CATALOG ON MV_CATALOG(UNIQUE_ID);
Also, even though the column is named UNIQUE_ID, it is not unique. It was supposed to be originally, but now it is repeated N times. I know this is wrong, we'll change that ASAP.
This "UNIQUE_ID" (which is not really unique) is filled by joining the IDs of two other tables. This way, our "materialized view" removes the need of at least three joins when the user searches on our catalog, which improved our query times from ~20 seconds to ~2 seconds.
We don't call sqlite3 API directly on our queries, we have developed a wrapper class around it and we've been using it for at least 2 years now. And it's the first time ever we've been on this situation, but again it's the first time we're handling so much data.
A couple of thoughts:
You're not showing us the creation of any index on FMV_CATALOG
. If nothing else, if UNIQUE_ID
is, as the name suggests, unique, then I'd be inclined to define the table with a PRIMARY KEY
:
CREATE TABLE FMV_CATALOG(
UNIQUE_ID TEXT PRIMARY KEY,
PRODUCT_ID INTEGER,
<bunch of metadata/filtered columns - total of 20 columns>
);
You should try using the SQLite EXPLAIN QUERY PLAN
command to look at the query and look at its plan and make sure it's availing itself of your index. Do this as it is, and then again with PRIMARY KEY
(and perhaps if that still doesn't do it, an index on the fields in your WHERE
clause), and make sure the final query is definitely using your index.
I'm not sure why, if you have the unique id, why you're also looking at the other fields. If adding of the primary key (and possibly other index(es)) doesn't solve the problem, I might try just retrieving the record based upon the unique id, and then check for conformance with your other parameters in code. I don't believe you need to do this, but it's a worst case scenario.
In terms of why it will slow down, that's harder to guess what's going on without seeing the code (which I'm sure is too complicated to share in a simple S.O. question). I could imagine strange behavior if, for example, you fail to sqlite3_finalize
after one of your sqlite3_prepare_v2
statements or if you accidentally failed to close the database and then opened it again elsewhere. I could imagine performance issues that might come in place if the sequence of sqlite3
calls wasn't precisely right. Use of something like FMDB can minimize the chance of those sorts of issues occurring (as well as simplifying your SQLite code). Or, if that's too radical of a step, try writing your own macros that call the SQLite calls, but also log the fact that you've called that sqlite3
function, and pour through that log and double check the sequence of your SQLite calls.
The only thing I can suggest is whether you can construct a simplified project that can reproduce the aberrant behavior. Tracking down a Heisenbug can be infuriating: Unless you can consistently reproduce the bug, it's hard to track down.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With