Overview:
I have three tables 1) subscribers, bios, and shirtsizes and i need to find the subscribers without a bio or shirtsizes
the tables are laid out such as
subscribers
| season_id | user_id |
bio
| bio_id | user_id |
shirt sizes
| bio_id | shirtsize |
And I need to find all users who do not have a bio or shirtsize, (if no bio; then no shirtsize via relation) for any given season.
I originally wrote a query like:
SELECT *
FROM subscribers s
LEFT JOIN bio b ON b.user_id = subscribers.user_id
LEFT JOIN shirtsizes ON shirtsize.bio_id = bio.bio_id
WHERE s.season_id = 185181 AND (bio.bio_id IS NULL OR shirtsize.size IS NULL);
but it is taking 10 seconds to complete now.
I am wondering how I can restructure the query (or possibly the problem) so that it will preform reasonably.
Here is the mysql explain: (ogu = subscribers, b = bio, tn = shirtshize)
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+---------+---------+-------------+--------+-------------+
| 1 | SIMPLE | ogu | ref | PRIMARY | PRIMARY | 4 | const | 133 | Using where |
| 1 | SIMPLE | b | index | NULL | PRIMARY | 8 | NULL | 187644 | Using index |
| 1 | SIMPLE | tn | ref | nid | nid | 4 | waka2.b.nid | 1 | Using where |
The above is pretty sanitized, here's the realz info:
mysql> DESCRIBE subscribers
+-----------+---------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-----------+---------+------+-----+---------+-------+
| subscribers | int(11) | NO | PRI | | |
| uid | int(11) | NO | PRI | | |
mysql> DESCRIBE bio;
+-------+------------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+------------------+------+-----+---------+-------+
| bio_id | int(10) unsigned | NO | PRI | 0 | |
| uid | int(10) unsigned | NO | PRI | 0 | |
mysql> DESCRIBE shirtsize;
+-------+------------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+------------------+------+-----+---------+-------+
| bio_id | int(10) unsigned | NO | PRI | 0 | |
| shirtsize | int(10) unsigned | NO | PRI | 0 | |
and the real query looks like:
SELECT ogu.nid, ogu.is_active, ogu.uid, b.nid AS bio_node, tn.nid AS size
FROM og_uid ogu
LEFT JOIN bio b ON b.uid = ogu.uid
LEFT JOIN term_node tn ON tn.nid = b.nid
WHERE ogu.nid = 185033 AND ogu.is_admin = 0
AND (b.nid IS NULL OR tn.tid IS NULL)
nid is season_id or bio_id (with a type); term_node is going to be the shirtsize
The LEFT JOIN query is slower than the INNER JOIN query because it's doing more work. From the EXPLAIN output, it looks like MySQL is doing nested loop join.
Joins: If your query joins two tables in a way that substantially increases the row count of the result set, your query is likely to be slow. There's an example of this in the subqueries lesson. Aggregations: Combining multiple rows to produce a result requires more computation than simply retrieving those rows.
The query should be OK. I would run it through a query analyzer and refine the indexes on the tables.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With