Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

SQLite optimizing multi-select insert

Tags:

sql

sqlite

I've been using SQL for years now but rarely anything more that simple inserts and selects etc... so I'm no SQL expert. I'm wondering if I could get some help in optimizing a more complex SQL statement that I'm executing on SQLite, from PHP through PDO.

The statement seems to work correctly, just seems to take longer that I would have expected (or perhaps I'm just expecting too much).

This is the SQL:

INSERT OR IGNORE INTO MailQueue(SubscriberID, TemplateID)
    SELECT Subscribers.ID, '1' AS TemplateID
    FROM Subscribers 
    INNER JOIN SubscriberGroups ON Subscribers.ID=SubscriberGroups.SubscriberID
    WHERE SubscriberGroups.GroupID IN ('1', '2', '3')
    AND Subscribers.ID NOT IN 
        ( 
        SELECT Subscribers.ID FROM Subscribers 
        INNER JOIN SubscriberGroups ON Subscribers.ID=SubscriberGroups.SubscriberID
        WHERE SubscriberGroups.GroupID IN ('4', '5', '6')
        );

What I've got is a list of subscribers, in one or more groups. I want to add subscribers to a mail queue, selecting those that belong to one or more groups (1,2,3) but exclude those that are also in another set of groups (4,5,6).

Firstly, is the above SQL typical of how to do this?

Secondly, what indicies should I have to make this work as efficiently as possible?

Currently it takes about 30 seconds to get through about 5000 subscriber records (and a handful of groups) on a avg spec LAMP.

At the end of the day performance isn't that critical, but I'd like to understand this stuff better so any insight greatly appreciated.

Brad

like image 408
Brad Robinson Avatar asked Apr 10 '09 14:04

Brad Robinson


People also ask

Does SQLite have a query optimizer?

The query optimizer in SQLite has basically two choices on how to implement this query. (There are actually six different choices, but we will only consider two of them here.) Pseudocode below demonstrating these two choices. The same indexes are used to speed up every loop in both implementation options.

How many writes per second SQLite?

Actually, SQLite will easily do 50,000 or more INSERT statements per second on an average desktop computer. But it will only do a few dozen transactions per second. Transaction speed is limited by the rotational speed of your disk drive.

What is faster than SQLite?

With Actian Zen, developers and product managers get all the advantages of SQLite but in a powerful, secure, and scalable engine that can run serverless or as a client-server. Actian Zen is orders of magnitude faster than SQLite.


1 Answers

Chances are the extra joins are killing you. What if you do:

SELECT Subscribers.ID, '1' AS TemplateID
FROM Subscribers 
WHERE EXISTS( SELECT *
                FROM SubscriberGroups
               WHERE Subscribers.ID=SubscriberGroups.SubscriberID
                                 AND SubscriberGroups.GroupID IN ('1', '2', '3') )

  AND NOT EXISTS( SELECT *
                    FROM SubscriberGroups
                   WHERE Subscribers.ID=SubscriberGroups.SubscriberID 
                     AND SubscriberGroups.GroupID IN ('4', '5', '6')
    );

You'd also want to make sure that you have an index on SubscriberGroups( SubscriberID, GroupID )

My guess is that Subscribers already has an index on ID, right?

EDIT: Another option, which may or may not be faster. Look at the query plans of each to see...

This one may be a single index scan which could be faster than two index seeks, but it depends on SQLite's optimizer...

SELECT Subscribers.ID, '1' AS TemplateID
FROM Subscribers 
INNER JOIN( SELECT SUM( CASE WHEN GroupID IN('1', '2', '3') THEN 1 ELSE 0 END ) AS inGroup,
                   SUM( CASE WHEN GroupID IN('4', '5', '6') THEN 1 ELSE 0 END ) AS outGroup,
                   SubscriberID
                            FROM SubscriberGroups
                         WHERE SubscriberGroups.GroupID IN ('1', '2', '3', '4', '5', '6' )
          ) SubscriberGroups
       ON Subscribers.ID=SubscriberGroups.SubscriberID
      AND inGroup  > 0
      AND outGroup = 0
like image 118
Matt Rogish Avatar answered Oct 26 '22 23:10

Matt Rogish