I have the following tables in question:
My query looks like this, the EXPLAIN results are attached below:
SELECT
DISTINCT (Personas.id),
Personas.parent_id,
Personas.persona,
Personas.subpersonas_count,
Personas.is_subpersona,
Personas.impressions_count,
Personas.created,
Personas.modified
FROM personas as Personas
INNER JOIN
impressions_personas ImpressionsPersonas ON (
Personas.id = ImpressionsPersonas.persona_id
)
inner JOIN impressions Impressions ON (Impressions.id = ImpressionsPersonas.impression_id AND Impressions.timestamp >= "2016-06-01 00:00:00" AND Impressions.timestamp <= "2016-07-31 00:00:00")
EXPLAIN
+----+-------------+---------------------+--------+-----------------------------------------------------------------------+-------------+---------+---------------------------------------------+------+----------+-----------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------------------+--------+-----------------------------------------------------------------------+-------------+---------+---------------------------------------------+------+----------+-----------------------+
| 1 | SIMPLE | Personas | ALL | PRIMARY | NULL | NULL | NULL | 159 | 100.00 | Using temporary |
| 1 | SIMPLE | ImpressionsPersonas | ref | impression_idx,persona_idx,comp_imp_persona,comp_imp_pri,comp_per_pri | persona_idx | 8 | gen1_d2go.Personas.id | 396 | 100.00 | Distinct |
| 1 | SIMPLE | Impressions | eq_ref | PRIMARY,timestamp,timestamp_id | PRIMARY | 8 | gen1_d2go.ImpressionsPersonas.impression_id | 1 | 100.00 | Using where; Distinct |
+----+-------------+---------------------+--------+-----------------------------------------------------------------------+-------------+---------+---------------------------------------------+------+----------+-----------------------+
3 rows in set, 1 warning (0.00 sec)
CREATE STATEMENT FOR PERSONAS
CREATE TABLE `personas` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`parent_id` bigint(20) unsigned DEFAULT NULL,
`persona` varchar(150) NOT NULL,
`subpersonas_count` int(10) unsigned DEFAULT '0',
`is_subpersona` tinyint(1) unsigned DEFAULT '0',
`impressions_count` bigint(20) unsigned DEFAULT '0',
`created` datetime DEFAULT NULL,
`modified` datetime DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `lookup` (`parent_id`,`persona`),
KEY `parent_index` (`parent_id`),
KEY `persona` (`persona`),
KEY `persona_a_id` (`id`,`persona`),
CONSTRAINT `self_referential_join_to_self` FOREIGN KEY (`parent_id`) REFERENCES `personas` (`id`) ON DELETE CASCADE ON UPDATE NO ACTION
) ENGINE=InnoDB AUTO_INCREMENT=1049 DEFAULT CHARSET=utf8;
CREATE STATEMENT FOR IMPRESSIONS_PERSONAS
CREATE TABLE `impressions_personas` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`impression_id` bigint(20) unsigned NOT NULL,
`persona_id` bigint(20) unsigned NOT NULL,
`created` datetime DEFAULT NULL,
`modified` datetime DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `impression_idx` (`impression_id`),
KEY `persona_idx` (`persona_id`),
KEY `comp_imp_persona` (`impression_id`,`persona_id`),
KEY `comp_imp_pri` (`impression_id`,`id`),
KEY `comp_per_pri` (`persona_id`,`id`),
CONSTRAINT `impression` FOREIGN KEY (`impression_id`) REFERENCES `impressions` (`id`) ON DELETE CASCADE ON UPDATE NO ACTION,
CONSTRAINT `persona` FOREIGN KEY (`persona_id`) REFERENCES `personas` (`id`) ON DELETE CASCADE ON UPDATE NO ACTION
) ENGINE=InnoDB AUTO_INCREMENT=19387839 DEFAULT CHARSET=utf8;
CREATE STATEMENT FOR IMPRESSIONS
CREATE TABLE `impressions` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`device_id` bigint(20) unsigned NOT NULL,
`beacon_id` bigint(20) unsigned NOT NULL,
`zone_id` bigint(20) unsigned NOT NULL,
`application_id` bigint(20) unsigned DEFAULT NULL,
`timestamp` datetime NOT NULL,
`google_place_id` bigint(20) unsigned DEFAULT NULL,
`name` varchar(60) DEFAULT NULL,
`lat` decimal(15,10) DEFAULT NULL,
`lng` decimal(15,10) DEFAULT NULL,
`personas_count` int(10) unsigned DEFAULT '0',
`created` datetime DEFAULT NULL,
`modified` datetime DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `device_idx` (`device_id`),
KEY `zone_idx` (`zone_id`),
KEY `beacon_id_idx2` (`beacon_id`),
KEY `timestamp` (`timestamp`),
KEY `appid_fk_idx_idx` (`application_id`),
KEY `comp_lookup` (`device_id`,`beacon_id`,`timestamp`),
KEY `timestamp_id` (`timestamp`,`id`),
CONSTRAINT `appid_fk_idx` FOREIGN KEY (`application_id`) REFERENCES `applications` (`id`) ON DELETE NO ACTION ON UPDATE NO ACTION,
CONSTRAINT `beacon_id` FOREIGN KEY (`beacon_id`) REFERENCES `beacons` (`id`) ON DELETE CASCADE ON UPDATE NO ACTION,
CONSTRAINT `device2` FOREIGN KEY (`device_id`) REFERENCES `devices` (`id`) ON DELETE NO ACTION ON UPDATE NO ACTION,
CONSTRAINT `zone_FK` FOREIGN KEY (`zone_id`) REFERENCES `zones` (`id`) ON DELETE CASCADE ON UPDATE NO ACTION
) ENGINE=InnoDB AUTO_INCREMENT=1582724 DEFAULT CHARSET=utf8;
Now - when I run the query without the DISTINCT and using a COUNT(*)
, it pulls about 17,000,000 records. Running it with DISTINCT yields 112 records. I am not sure why there are so many records showing up when the explain showed only 159 and 396.
Some information about the tables:
The Personas table contains 159 records. The ImpressionsPersonas table contains about 12.6 million, and Impressions contains about 920,000 records.
What we are doing is selecting the Personas
table and joining to the Impressions by way of the join table ImpressionsPersonas
. There are filters applied to the Impressions
table (date in this case).
Note: removing the date filter had a negligible impact on the execution time - which hovers right around 120s. Is there a way to filter these records down to cut down the execution time of this query?
I presume that you want to get the list of persons who have at least 1 impression within a specified time period. To get this, you can use such a correlated sub-query:
SELECT
Personas.id,
Personas.parent_id,
Personas.persona,
Personas.subpersonas_count,
Personas.is_subpersona,
Personas.impressions_count,
Personas.created,
Personas.modified
FROM personas as Personas
WHERE EXISTS(SELECT 1 FROM impressions_personas
LEFT JOIN impressions Impressions ON
Impressions.id = ImpressionsPersonas.impression_id
WHERE Personas.id = ImpressionsPersonas.persona_id
AND Impressions.timestamp >= "2016-06-01 00:00:00"
AND Impressions.timestamp <= "2016-07-31 00:00:00"
)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With