 

What is causing this memory leak when (inner) joining this table?

Tags:

php

mysql

I have SQL that, in my head, should run in under 1 second:

SELECT mem.`epid`,
       mem.`model_id`,
       em.`UKM_Make`,
       em.`UKM_Model`,
       em.`UKM_CCM`,
       em.`UKM_Submodel`,
       em.`Year`,
       em.`UKM_StreetName`,
       f.`fit_part_number`
FROM `table_one` AS mem
INNER JOIN `table_two` em ON mem.`epid` = em.`ePID`
INNER JOIN `table_three` f ON `mem`.`model_id` = f.`fit_model_id`
LIMIT 1;

When I run this SQL in the terminal, it takes 16 seconds to execute. However, if I remove the line:

INNER JOIN `table_three` f ON `mem`.`model_id` = f.`fit_model_id`

then it executes in 0.03 seconds. Unfortunately for me, I'm not too sure how to debug MySQL performance issues. This causes my PHP script to run out of memory trying to execute the query.
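To illustrate what is likely going on (this is an illustrative sketch with made-up data, not the poster's code): without an index on fit_model_id, the engine may have to rescan table_three for every candidate row, a nested-loop join, whereas an index turns each probe into a cheap lookup. The same cost difference can be shown in plain Python:

```python
# Tiny stand-ins for the two tables (hypothetical data):
table_one = [(epid, epid % 50) for epid in range(1000)]    # (epid, model_id)
table_three = [(f"P{n}", n % 50) for n in range(2000)]     # (fit_part_number, fit_model_id)

def join_no_index():
    """Nested-loop join: scans all of table_three once per table_one row."""
    out = []
    for epid, model_id in table_one:
        for part, fit_model_id in table_three:             # full scan every time
            if fit_model_id == model_id:
                out.append((epid, part))
    return out

def join_with_index():
    """'Indexed' join: one pass to build a lookup, then cheap probes."""
    index = {}
    for part, fit_model_id in table_three:
        index.setdefault(fit_model_id, []).append(part)
    return [(epid, part)
            for epid, model_id in table_one
            for part in index.get(model_id, [])]

# Same result either way; the no-index version just does ~2M comparisons.
assert sorted(join_no_index()) == sorted(join_with_index())
```

The first version does len(table_one) × len(table_three) comparisons, which is roughly what an unindexed join costs the server.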

Here are my table structures:

table_one

+----------+---------+------+-----+---------+-------+
| Field    | Type    | Null | Key | Default | Extra |
+----------+---------+------+-----+---------+-------+
| epid     | int(11) | YES  |     | NULL    |       |
| model_id | int(11) | YES  |     | NULL    |       |
+----------+---------+------+-----+---------+-------+

table_two

+----------------+--------------+------+-----+---------+-------+
| Field          | Type         | Null | Key | Default | Extra |
+----------------+--------------+------+-----+---------+-------+
| id             | int(11)      | NO   | PRI | NULL    |       |
| ePID           | int(11)      | NO   |     | NULL    |       |
| UKM_Make       | varchar(100) | NO   |     | NULL    |       |
| UKM_Model      | varchar(100) | NO   |     | NULL    |       |
| UKM_CCM        | int(11)      | NO   |     | NULL    |       |
| UKM_Submodel   | varchar(100) | NO   |     | NULL    |       |
| Year           | int(11)      | NO   |     | NULL    |       |
| UKM_StreetName | varchar(100) | NO   |     | NULL    |       |
| Vehicle Type   | varchar(100) | NO   |     | NULL    |       |
+----------------+--------------+------+-----+---------+-------+

table_three

+-----------------+-------------+------+-----+---------+----------------+
| Field           | Type        | Null | Key | Default | Extra          |
+-----------------+-------------+------+-----+---------+----------------+
| fit_fitment_id  | int(11)     | NO   | PRI | NULL    | auto_increment |
| fit_part_number | varchar(50) | NO   |     | NULL    |                |
| fit_model_id    | int(11)     | YES  |     | NULL    |                |
| fit_year_start  | varchar(4)  | YES  |     | NULL    |                |
| fit_year_end    | varchar(4)  | YES  |     | NULL    |                |
+-----------------+-------------+------+-----+---------+----------------+

The above is the output of DESCRIBE $table_name.

Is there anything that I'm obviously missing? If not, how can I find out why including table_three causes such a slow response time?

EDIT ONE:

After the indexing suggestion (I used CREATE INDEX fit_model ON table_three (fit_model_id)), it performs the query in 0.00 seconds (in MySQL). With the LIMIT removed, it is still running even after applying the suggestion ... so not quite there. Following Anton's suggestion to use EXPLAIN, I ran it and got this output:

+------+-------------+-------+------+---------------+-----------+---------+----------------------+-------+-------------------------------------------------+
| id   | select_type | table | type | possible_keys | key       | key_len | ref                  | rows  | Extra                                           |
+------+-------------+-------+------+---------------+-----------+---------+----------------------+-------+-------------------------------------------------+
|    1 | SIMPLE      | mem   | ALL  | NULL          | NULL      | NULL    | NULL                 |  5587 | Using where                                     |
|    1 | SIMPLE      | f     | ref  | fit_model     | fit_model | 5       | mastern.mem.model_id |    14 |                                                 |
|    1 | SIMPLE      | em    | ALL  | NULL          | NULL      | NULL    | NULL                 | 36773 | Using where; Using join buffer (flat, BNL join) |
+------+-------------+-------+------+---------------+-----------+---------+----------------------+-------+-------------------------------------------------+

EDIT TWO

I've added a Foreign Key based on suggestions using the below query:

ALTER TABLE `table_one`
ADD CONSTRAINT `model_id_fk_tbl_three`
FOREIGN KEY (`model_id`)
REFERENCES `table_three` (`fit_model_id`)

MySQL is still executing the command - there are a lot of rows, so I half-expected this behaviour. With PHP I can break up the query and build my array that way, so I guess that possibly solves the issue - though is there anything more I can do to try and reduce execution time?
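The "break up the query" idea can be sketched like this (a hedged sketch, not the poster's PHP: SQLite stands in for MySQL, and the table contents are invented). Fetching fixed-size chunks keyed on the primary key means the client never holds the whole result set in memory, and avoids the growing cost of large OFFSETs:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE table_three (fit_fitment_id INTEGER PRIMARY KEY,"
            " fit_part_number TEXT)")
con.executemany("INSERT INTO table_three (fit_part_number) VALUES (?)",
                [(f"P{n}",) for n in range(2500)])

def chunks(chunk_size=1000):
    """Yield the table in chunks using keyset pagination on the PK."""
    last_id = 0
    while True:
        rows = con.execute(
            "SELECT fit_fitment_id, fit_part_number FROM table_three "
            "WHERE fit_fitment_id > ? ORDER BY fit_fitment_id LIMIT ?",
            (last_id, chunk_size)).fetchall()
        if not rows:
            break
        yield rows
        last_id = rows[-1][0]          # resume after the last row seen

total = sum(len(c) for c in chunks())  # 2500 rows, processed 1000 at a time
```

Each chunk query is cheap because it seeks straight to fit_fitment_id > last_id on the primary key.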

treyBake asked Oct 08 '18 08:10

1 Answer

Based on everyone's comments etc., I managed to do a few things that made my query run a hell of a lot quicker and stopped it crashing my script.

1) Indexes

I created an index on my table_three for the field fit_model_id:

CREATE INDEX fit_model ON `table_three` (`fit_model_id`);

This took my LIMIT 1 query from 16 seconds to 0.03 seconds of execution time (in the MySQL CLI).

However, 100 rows or so would still take a lot longer than I thought.

2) Foreign Keys

I created a foreign key that linked table_one.model_id = table_three.fit_model_id using the below query:

ALTER TABLE `table_one`
ADD CONSTRAINT `model_id_fk_tbl_three`
FOREIGN KEY (`model_id`)
REFERENCES `table_three` (`fit_model_id`)

This definitely helped, but still felt like more could be done.

3) OPTIMIZE TABLE

I then ran OPTIMIZE TABLE on these tables:

  • table_one
  • table_three

This then made my script work and my query as fast as ever. However, I still had a large data set, so I let the query run in the MySQL CLI while increasing the LIMIT by 1000 on each script run to help the indexing process; I had got all the way to 30K rows before it started crashing.

The CLI took 31 minutes and 8 seconds to complete. So I did this:

31 x 60 = 1860

1860 + 8 = 1868

1868 / 448476 = 0.0042

So each row took 0.0042 seconds to complete - which is fast enough in my eyes.
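The same back-of-the-envelope arithmetic, written out as a quick check:

```python
# Convert the CLI runtime to seconds and divide by the row count.
minutes, seconds, rows = 31, 8, 448476
elapsed = minutes * 60 + seconds   # 1868 seconds total
per_row = elapsed / rows           # ~0.0042 seconds per row
```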

Thanks to everyone for commenting and helping me debug and fix the issue :)

treyBake answered Oct 01 '22 16:10